Recommended way of serializing Django RawQuerySet with non-model fields - django

Given a query like SELECT *, 'hello' AS world FROM myApp_myModel, I'd like to serialize the result to JSON.
It doesn't seem like a big deal, and there are plenty of similar questions on SO, but none seems to give a straight answer.
So far I've tried:
data = myModel.objects.raw(query)
# gives: ModelState is not serializable
json.dumps([dict(r.__dict__) for r in data])
# doesn't serialize 'world' column, only model fields:
serializers.serialize('json', data)
# dear God:
for r in data:
    for k in dict(r.__dict__):
        print(getattr(r, k))
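The first attempt above fails only because each model instance's __dict__ also carries a private _state entry. A minimal sketch of filtering out underscore-prefixed keys before dumping follows. FakeRow and FakeState are hypothetical stand-ins, not Django classes, so the snippet runs without a database; with a real RawQuerySet the same row_to_dict helper (also a made-up name) would be applied to each row.

```python
import json
from datetime import date


class FakeState:
    """Stand-in for Django's ModelState, which json cannot serialize."""


class FakeRow:
    """Hypothetical stand-in for a model instance returned by a raw query."""

    def __init__(self, **columns):
        self._state = FakeState()      # Django keeps bookkeeping under _state
        self.__dict__.update(columns)  # model fields and extra columns alike


def row_to_dict(row):
    # Keep every public attribute, including extra columns such as 'world'.
    return {k: v for k, v in vars(row).items() if not k.startswith("_")}


rows = [FakeRow(id=1, name="a", created=date(2020, 1, 2), world="hello")]
# default=str turns dates (and other non-JSON types) into strings.
payload = json.dumps([row_to_dict(r) for r in rows], default=str)
print(payload)
```

This keeps the extra column but gives up the pk/model/fields envelope that serializers.serialize produces.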

The issue:
The built-in Django core serializers do not include extra fields (neither from raw queries nor from annotation expressions). They only take the model fields from _meta.local_fields.
You can see this in the Django source code, django/core/serializers/base.py:
concrete_model = obj._meta.concrete_model  # obj is a model object
...
for field in concrete_model._meta.local_fields:
    if field.serialize or field is pk_parent:
        if field.remote_field is None:
            if (self.selected_fields is None
                    or field.attname in self.selected_fields):
                self.handle_field(obj, field)
        else:
            if (self.selected_fields is None
                    or field.attname[:-3] in self.selected_fields):
                self.handle_fk_field(obj, field)
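The effect of that loop can be illustrated without Django. In this rough sketch, Row and declared_fields are hypothetical analogues of a model and its _meta.local_fields. Because the serializer walks the declared fields rather than the instance's attributes, anything attached by the query is never visited.

```python
class Row:
    # Hypothetical analogue of _meta.local_fields: the declared fields only.
    declared_fields = ("id", "name")


def serialize(obj):
    # Walk the declared fields; never look at obj.__dict__.
    return {name: getattr(obj, name) for name in type(obj).declared_fields}


row = Row()
row.id, row.name = 1, "a"
row.world = "hello"  # extra column attached by the raw query

print(serialize(row))
```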
Django REST framework to the rescue:
To solve your issue you can use functionality that is not built in, by including a REST package in your project. For example, Django REST framework can handle extra fields:
from django.db.models import F
from rest_framework import serializers
from rest_framework.renderers import JSONRenderer
from aula.apps.alumnes.models import MyModel
# annotate every row with an extra, non-model attribute
data = MyModel.objects.annotate(dummy=F('some_field'))
class MyModelSerializer(serializers.ModelSerializer):
    # the extra attribute has to be declared explicitly
    dummy = serializers.CharField()
    class Meta:
        model = MyModel
        fields = ('some_other_field', 'dummy')
        read_only_fields = (
            'dummy',
        )
m = MyModelSerializer(data, many=True)
JSONRenderer().render(m.data)

You can create a DRF serializer for the task:
http://www.django-rest-framework.org/api-guide/serializers/
For example:
class MyModelSerializer(serializers.ModelSerializer):
    world = serializers.ReadOnlyField()
    class Meta:
        model = MyModel
        fields = ('world', ...)
You can also use serializer inheritance, etc.; see the docs.

There is a clean way to do this using Django REST Framework.
First off, note that a raw query can also return fields that aren't defined on the model.
For example (from the Django documentation on raw queries):
>>> people = Person.objects.raw('SELECT *, age(birth_date) AS age FROM myapp_person')
>>> for p in people:
...     print("%s is %s." % (p.first_name, p.age))
John is 37.
Jane is 42.
That means you can use a standard serializer; you just need to tell it what to do with fields that were not originally on the model. Consider the case below: I needed to join three tables for a user export, namely the user, the company they belong to, and the company's membership. If your table has thousands of users and you used a standard SerializerMethodField, it would result in thousands of queries, one per user, to fetch the related company's membership. So instead, here is the solution I used:
# api.py
class UserSAMAExportListApiView(ListAPIView):
    serializer_class = UserExportSerializer
    model = User
    def get_queryset(self):
        q = User.objects.raw(
            """
            SELECT
                [users_user].[id],
                [users_user].[email],
                [companies_company].[title] AS company__title,
                [companies_company].[registration_number] AS company__registration_number,
                [memberships_membership].number AS company__membership__number
            FROM [users_user]
            LEFT OUTER JOIN [dbo].[companies_company]
                ON ([users_user].[company_id] = [companies_company].[id])
            LEFT OUTER JOIN [memberships_membership]
                ON ([companies_company].[id] = [memberships_membership].[company_id])
            WHERE ([memberships_membership].[expiry_date] >= %s)
            """,
            [date.today()],
        )
        return q
Then just tell your standard serializer that there are some new fields you defined:
# serializers.py
class UserExportSerializer(ModelSerializer):
    class Meta:
        model = User
        fields = (
            'id',
            'email',
            'company__title',
            'company__registration_number',
            'company__membership__number',
        )
    def build_unknown_field(self, field_name, model_class):
        """
        Return a two-tuple of (cls, kwargs) to build a serializer field with,
        for fields that weren't originally on the model.
        """
        return fields.CharField, {'read_only': True}
And that's it. DRF will handle the rest in the standard way and do proper serialization for you.
Note that you have to override the build_unknown_field method. Here it simply converts all the non-model fields to text; if you want, you can check the field name and return other field types instead.

Related

Speeding up Django Rest Framework Model Serializer N+1 Query problem

I have a DRF ModelSerializer class that serializes an Order model. This serializer has a field:
num_modelA = serializers.SerializerMethodField()
def get_num_modelA(self, o):
    r = ModelA.objects.filter(modelB__modelC__order=o).count()
    return r
Where ModelA has a ForeignKey field modelB, ModelB has a ForeignKey field modelC, and ModelC has a ForeignKey field order.
The problem with this is obviously that for each order that gets serialized it makes an additional query to the DB which slows performance down.
I've implemented a static method setup_eager_loading as described here that fixed the N+1 query problem for other fields I was having.
@staticmethod
def setup_eager_loading(queryset):
    # select_related for "to-one" relationships
    queryset = queryset.select_related('modelD', 'modelE')
    return queryset
My idea was I could use prefetch_related as well to reduce the number of queries. But I am unsure how to do this since Order and ModelA are separated by multiple foreign keys. Let me know if any other information would be useful
You can work with an annotation:
from django.db.models import Count
# …
@staticmethod
def setup_eager_loading(queryset):
    # select_related for "to-one" relationships
    return queryset.select_related('modelD', 'modelE').annotate(
        num_modelA=Count('modelC__modelB__modelA')
    )
In the serializer for your Order, you can then use num_modelA as an IntegerField:
from rest_framework import serializers
class OrderSerializer(serializers.ModelSerializer):
    num_modelA = serializers.IntegerField()
    class Meta:
        model = Order
        fields = ['num_modelA', 'and', 'other', 'fields']
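To see why the annotation removes the N+1 pattern, it can help to look at the kind of SQL it stands for: one grouped query across the joined tables instead of one COUNT per order. The sketch below uses sqlite3 from the standard library with made-up table and column names (orders, model_c, model_b, model_a). It approximates the shape of the query, not the exact SQL Django would emit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE model_c (id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE model_b (id INTEGER PRIMARY KEY, model_c_id INTEGER);
    CREATE TABLE model_a (id INTEGER PRIMARY KEY, model_b_id INTEGER);
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO model_c VALUES (10, 1), (11, 1);
    INSERT INTO model_b VALUES (20, 10), (21, 11);
    INSERT INTO model_a VALUES (30, 20), (31, 20), (32, 21);
""")

# One query for all orders; LEFT JOINs keep orders that have no model_a rows.
counts = dict(conn.execute("""
    SELECT o.id, COUNT(a.id)
    FROM orders o
    LEFT JOIN model_c c ON c.order_id = o.id
    LEFT JOIN model_b b ON b.model_c_id = c.id
    LEFT JOIN model_a a ON a.model_b_id = b.id
    GROUP BY o.id
""").fetchall())
print(counts)
```

If several such counts are annotated on the same queryset, the joins can multiply rows, in which case Count(..., distinct=True) is usually needed.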

Reload choices dynamically when using MultipleChoiceFilter

I am trying to construct a MultipleChoiceFilter where the choices are the set of possible dates that exist on a related model (DatedResource).
Here is what I am working with so far...
resource_date = filters.MultipleChoiceFilter(
    field_name='dated_resource__date',
    choices=[
        (d, d.strftime('%Y-%m-%d')) for d in
        sorted(resource_models.DatedResource.objects.all().values_list('date', flat=True).distinct())
    ],
    label="Resource Date"
)
When this is displayed in a html view...
This works fine at first, however if I create new DatedResource objects with new distinct date values I need to re-launch my webserver in order for them to get picked up as a valid choice in this filter. I believe this is because the choices list is evaluated once when the webserver starts up, not every time my page loads.
Is there any way to get around this? Maybe through some creative use of a ModelMultipleChoiceFilter?
Thanks!
Edit:
I tried some simple ModelMultipleChoiceFilter usage, but I am hitting some issues.
resource_date = filters.ModelMultipleChoiceFilter(
    field_name='dated_resource__date',
    queryset=resource_models.DatedResource.objects.all().values_list('date', flat=True).order_by('date').distinct(),
    label="Resource Date"
)
The HTML form is showing up just fine, however the choices are not accepted values to the filter. I get "2019-04-03" is not a valid value. validation errors, I am assuming because this filter is expecting datetime.date objects. I thought about using the coerce parameter, however those are not accepted in ModelMultipleChoice filters.
Per dirkgroten's comment, I tried to use what was suggested in the linked question. This ends up being something like
resource_date = filters.ModelMultipleChoiceFilter(
    field_name='dated_resource__date',
    to_field_name='date',
    queryset=resource_models.DatedResource.objects.all(),
    label="Resource Date"
)
This also isn't what I want, as the HTML form is now a) displaying the str representation of each DatedResource instead of the DatedResource.date field, and b) showing values that are not unique (e.g. if I have two DatedResource objects with the same date, both of their str representations appear in the list). It also isn't sustainable because I have 200k+ DatedResources, and the page hangs when attempting to load them all (as compared to the values_list filter, which is able to pull all distinct dates out in seconds).
One easy solution is to override the __init__() method of the FilterSet class.
from django_filters import filters, filterset
class FooFilter(filterset.FilterSet):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        try:
            # recompute the choices every time the filterset is instantiated
            self.filters['resource_date'].extra['choices'] = [(d, d.strftime('%Y-%m-%d')) for d in sorted(
                resource_models.DatedResource.objects.all().values_list('date', flat=True).distinct())]
        except (KeyError, AttributeError):
            pass
    resource_date = filters.MultipleChoiceFilter(field_name='dated_resource__date', choices=[], label="Resource Date")
NOTE: provide choices=[] in the field definition on the FilterSet class.
Results
I tested and verified this solution with the following dependencies:
1. Python 3.6
2. Django 2.1
3. DRF 3.8.2
4. django-filter 2.0.0
I used the following code to reproduce the behaviour:
# models.py
from django.db import models
class Musician(models.Model):
    name = models.CharField(max_length=50)
    def __str__(self):
        return f'{self.name}'
class Album(models.Model):
    artist = models.ForeignKey(Musician, on_delete=models.CASCADE)
    name = models.CharField(max_length=100)
    release_date = models.DateField()
    def __str__(self):
        return f'{self.name} : {self.artist}'
# serializers.py
from rest_framework import serializers
class AlbumSerializer(serializers.ModelSerializer):
    artist = serializers.StringRelatedField()
    class Meta:
        fields = '__all__'
        model = Album
# filters.py
from django_filters import rest_framework as filters
class AlbumFilter(filters.FilterSet):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.filters['release_date'].extra['choices'] = self.get_album_filter_choices()
    def get_album_filter_choices(self):
        release_date_list = Album.objects.values_list('release_date', flat=True).distinct()
        return [(date, date) for date in release_date_list]
    release_date = filters.MultipleChoiceFilter(choices=[])
    class Meta:
        model = Album
        fields = ('release_date',)
# views.py
from rest_framework.viewsets import ModelViewSet
from django_filters import rest_framework as filters
class AlbumViewset(ModelViewSet):
    serializer_class = AlbumSerializer
    queryset = Album.objects.all()
    filter_backends = (filters.DjangoFilterBackend,)
    filter_class = AlbumFilter
Here I've used django-filter with DRF.
I then populated some data through the Django admin console and checked the album API and the release_date choices it offered (the screenshots are not reproduced here).
Then I added a new entry through the Django admin and refreshed the DRF API endpoint, and the possible choices were updated accordingly.
I have looked into your problem and have the following suggestions.
The Problem
You have diagnosed the problem correctly. The choices for your MultipleChoiceFilter are calculated statically when the server starts. That's why they don't get updated dynamically whenever you insert a new instance of DatedResource.
To get it working correctly, you have to provide choices dynamically to MultipleChoiceFilter. I searched the documentation but did not find anything regarding this, so here is my solution.
The solution
You have to extend MultipleChoiceFilter and create your own filter class. Here is the one I created:
from django_filters.conf import settings
import django_filters
class LazyMultipleChoiceFilter(django_filters.MultipleChoiceFilter):
    def get_field_choices(self):
        choices = self.extra.get('choices', [])
        # accept either a static list or a callable that returns one
        if callable(choices):
            choices = choices()
        return choices
    @property
    def field(self):
        if not hasattr(self, '_field'):
            field_kwargs = self.extra.copy()
            if settings.DISABLE_HELP_TEXT:
                field_kwargs.pop('help_text', None)
            field_kwargs.update(choices=self.get_field_choices())
            self._field = self.field_class(label=self.label, **field_kwargs)
        return self._field
Now you can use this class as a replacement and pass choices as a lambda function like this:
resource_date = LazyMultipleChoiceFilter(
    field_name='dated_resource__date',
    choices=lambda: [
        (d, d.strftime('%Y-%m-%d')) for d in
        sorted(resource_models.DatedResource.objects.all().values_list('date', flat=True).distinct())
    ],
    label="Resource Date"
)
Whenever an instance of the filter is created, the choices will be updated dynamically. You can also pass choices statically (without a lambda function) to this field if you want the default behavior.
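The underlying difference is plain Python: an expression in a class body runs once when the module is imported, while code in __init__ (or a callable invoked later) runs every time an instance is built. A small sketch, with DATES as a hypothetical stand-in for the database table and EagerFilter/LazyFilter as made-up classes rather than django-filter ones:

```python
DATES = ["2019-04-03"]  # hypothetical stand-in for rows in the database


def current_choices():
    return [(d, d) for d in sorted(set(DATES))]


class EagerFilter:
    # Evaluated once, when the class body runs (i.e. at import time).
    choices = current_choices()


class LazyFilter:
    def __init__(self):
        # Evaluated on every instantiation.
        self.choices = current_choices()


DATES.append("2019-05-01")  # a new row appears after startup

print(EagerFilter().choices)  # still only the date known at import time
print(LazyFilter().choices)   # includes the newly added date
```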

Exclude related fields in model._meta.get_fields()

I have a model currently defined like this:
class Category(models.Model):
    ID = models.AutoField()
    name = models.CharField()
    desc = models.CharField()
Another model Subcategory has a ForeignKey defined on Category.
When I run:
Category._meta.get_fields()
I get:
(<ManyToOneRel: siteapp.subcategory>, <django.db.models.fields.AutoField: ID>, <django.db.models.fields.CharField: name>, <django.db.models.fields.CharField: desc>)
However, I don't want the ManyToOneRel fields; I just want the others.
Currently, I am doing something like this:
from django.db.models.fields.reverse_related import ManyToOneRel
field_list = []
for field in modelClass._meta.get_fields():
    if not isinstance(field, ManyToOneRel):
        field_list.append(field)
However, is there a better way to do this, with or without using the model _meta API?
You could use the concrete_fields property.
Category._meta.concrete_fields
However this is an internal Django API, and it may be better to use get_fields() with your own filtering, even though it may be a little more verbose.
I had the same issue while creating a serializer mixin that handles only GenericRelation fields. Unfortunately, get_fields() also returns ManyToOneRel objects in some cases, which causes trouble mainly when you need the attname of a field. Therefore I created a function that handles this by skipping all the ManyToOneRel objects:
def get_generic_relation_fields(self):
    """
    This function returns all the GenericRelation
    fields needed to return the values that are
    related such as polymorphic models.
    """
    fields = [field.attname for field in self.Meta.model._meta.get_fields() if not isinstance(field, ManyToOneRel)]
    generic_relation_fields = []
    for field in fields:
        get_type = self.Meta.model._meta.get_field(field)
        field_type = get_type.__class__.__name__
        if field_type == "GenericRelation":  # <----- change the class name to filter different fields.
            generic_relation_fields.append(field)
    return generic_relation_fields
Usage:
class MyModel(models.Model):
    . . .
    first_name = GenericRelation(MyUserPolymorphic)
    last_name = GenericRelation(MyUserPolymorphic)
    whatever = GenericRelation(MyUserPolymorphic)
class MyModelSerializer(serializers.ModelSerializer):
    def create(self, validated_data):
        fields = [field for field in get_generic_relation_fields(self)]
        print("FIELDS ---->", fields)
        . . .
Output on POST:
FIELDS ----> ['first_name', 'last_name', 'whatever']
Here I filtered on 'GenericRelation', but you can filter on other field types and handle them however you want.
I have been looking for a solution to this, and I ended up writing my own snippet which, while not the cleanest Pythonic code, works. I thought I'd put it here in case someone stumbles upon it in the future.
[field for field in fields if str(type(field)) != "<class 'django.db.models.fields.related.ForeignKey'>"]
I have also used it as follows to get a dict in the format {field_name: field_value}:
{
    field.name: (
        getattr(self, field.name)
        if str(type(field))
        != "<class 'django.db.models.fields.related.ForeignKey'>"
        else getattr(self, field.name).id
    )
    for field in fields
    if str(type(field))
    != "<class 'django.db.models.fields.reverse_related.ManyToOneRel'>"
}
I had to add an extra check for ForeignKey fields, as showing the field value usually didn't make sense in that case.
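One caveat with comparing str(type(field)) against a class path is that it only matches that exact class and misses subclasses. A minimal sketch of the difference from isinstance follows. It uses hypothetical stand-in classes rather than Django's real fields, and it uses "type(f) is not ForeignKey" as the equivalent of the string comparison. The only Django fact assumed is that OneToOneField subclasses ForeignKey.

```python
class Field:
    pass


class CharField(Field):
    pass


class ForeignKey(Field):
    pass


class OneToOneField(ForeignKey):  # mirrors Django's class hierarchy
    pass


fields = [CharField(), ForeignKey(), OneToOneField()]

# Exact-type comparison (what the string comparison amounts to).
kept_exact = [type(f).__name__ for f in fields if type(f) is not ForeignKey]

# isinstance also excludes subclasses of ForeignKey.
kept_isinstance = [type(f).__name__ for f in fields if not isinstance(f, ForeignKey)]

print(kept_exact)
print(kept_isinstance)
```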

Django Rest Framework – Custom Hyperlink field in serializer

How can I add a custom hyperlink field in a serializer? I would like to have a hyperlink field in my serializer that has query params in it. Since there is no way to pass query params from HyperlinkedRelatedField or HyperlinkedIdentityField as far as I know, I've tried using a SerializerMethodField. However, this only serializes to a string, and is not a clickable URL when I visit the API through my browser. My code looks something like this:
class MySerializer(serializers.HyperlinkedModelSerializer):
    custom_field = serializers.SerializerMethodField()
    class Meta:
        model = MyModel
        fields = ('url', 'custom_field')
    def get_custom_field(self, obj):
        result = '{}?{}'.format(
            reverse('my-view'),
            urllib.urlencode({'param': 'foo'})
        )
        return result
Also, I am having trouble understanding the difference between a HyperlinkedRelatedField and a HyperlinkedIdentityField, so a brief explanation would be appreciated.
This should do the trick:
from rest_framework.reverse import reverse
class MySerializer(serializers.HyperlinkedModelSerializer):
    custom_field = serializers.SerializerMethodField()
    class Meta:
        model = MyModel
        fields = ('url', 'custom_field')
    def get_custom_field(self, obj):
        result = '{}?{}'.format(
            reverse('my-view', args=[obj.id], request=self.context['request']),
            'param=foo'
        )
        return result
The reverse function in rest_framework takes a view name (whatever view you'd like to link to), either an args list (the object id, in this case) or kwargs, and a request object (which can be accessed inside the serializer at self.context['request']). It can additionally take a format parameter and any extra parameters (as a dictionary) that you want to pass to it.
The reverse function then builds a fully-formed URL for you. You can add query params by extending the format string (for example '{}?{}&{}&{}') and passing the additional query params to format() after 'param=foo'.
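Rather than hand-writing the query string, the parameters can be encoded with the standard library. This is a small sketch, not part of DRF: with_query is a hypothetical helper, and the base URL is a made-up example of what reverse() might return. On Python 3 the function lives in urllib.parse (the urllib.urlencode used in the question is the Python 2 location).

```python
from urllib.parse import urlencode


def with_query(base_url, params):
    """Append URL-encoded query parameters to an already-built URL."""
    if not params:
        return base_url
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


# The base URL is a made-up example of what reverse() might return.
url = with_query("http://testserver/my-view/1/", {"param": "foo", "q": "a b"})
print(url)
```

Inside get_custom_field, the first argument would be the result of the reverse(...) call shown above.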
The HyperlinkedIdentityField is used on the object itself that is being serialized. So a HyperlinkedIdentityField is being used in place of your primary key field on MyModel, because you are using a HyperlinkedModelSerializer, which creates a HyperlinkedIdentityField for the pk of the object being serialized.
The HyperlinkedRelatedField is used to define hyperlinked relationships to RELATED objects. So if there were a MySecondModel with a foreign key relationship to MyModel and you wanted to have a hyperlink on your MyModel serializer to all the related MySecondModel objects you would use a HyperlinkedRelatedField like so (remember to add the new field to your fields attribute in Meta):
class MySerializer(serializers.HyperlinkedModelSerializer):
    custom_field = serializers.SerializerMethodField()
    mysecondmodels = serializers.HyperlinkedRelatedField(
        many=True,
        read_only=True,
        view_name='mysecondmodel-detail'
    )
    class Meta:
        model = MyModel
        fields = ('url', 'custom_field', 'mysecondmodels')
    def get_custom_field(self, obj):
        result = '{}?{}'.format(
            reverse('my-view', args=[obj.id], request=self.context['request']),
            'param=foo'
        )
        return result
If it were a OneToOneField rather than ForeignKey field on MySecondModel then you would set many=False.
Hope this helps!

Django ORM access User table through multiple models

views.py
I'm creating a queryset that I want to serialize and return as JSON. The queryset looks like this:
all_objects = Program.objects.all()
test_data = serializers.serialize("json", all_objects, use_natural_keys=True)
This pulls back everything except for the 'User' model (which is linked across two models).
models.py
from django.db import models
from django.contrib.auth.models import User
class Time(models.Model):
    user = models.ForeignKey(User)
    ...
class CostCode(models.Model):
    program_name = models.TextField()
    ...
class Program(models.Model):
    time = models.ForeignKey(Time)
    program_select = models.ForeignKey(CostCode)
    ...
Question
My returned data has Time, Program, and CostCode information, but I'm unable to query back the 'User' table. How can I get back say the 'username' (from User Table) in the same queryset?
Note: I've changed my queryset to all_objects = Time.objects.all() and this gets User info, but then it doesn't pull in 'CostCode'. My models also have ModelManagers that return the get_by_natural_key so the relevant fields appear in my JSON.
Ultimately, I want data from all four models to appear in my serialized JSON fields, I'm just missing 'username'.
Here's a picture of how the JSON object currently appears in Firebug:
Thanks for any help!
It seems a bit heavyweight at first glance but you could look at using Django REST Framework:
http://www.django-rest-framework.org/api-guide/serializers#modelserializer
You can define and use the serializer classes without having to do anything else with the framework. The serializer returns a python dict which can then be easily dumped to JSON.
To get all fields from each related model as nested dicts you could do:
class ProgramSerializer(serializers.ModelSerializer):
    class Meta:
        model = Program
        depth = 2
all_objects = Program.objects.all()
serializer = ProgramSerializer(all_objects, many=True)
json_str = json.dumps(serializer.data)
To customise which fields are included for each model you will need to define a ModelSerializer class for each of your models, for example to output only the username for the time.user:
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ('username', )
class TimeSerializer(serializers.ModelSerializer):
    """
    specifying the field here rather than relying on `depth` to automatically
    render nested relations allows us to specify a custom serializer class
    """
    user = UserSerializer()
    class Meta:
        model = Time
class ProgramSerializer(serializers.ModelSerializer):
    time = TimeSerializer()
    class Meta:
        model = Program
        depth = 1  # render nested CostCode with default output
all_objects = Program.objects.all()
serializer = ProgramSerializer(all_objects, many=True)
json_str = json.dumps(serializer.data)
What you really want is a "deep" serialization of objects which Django does not natively support. This is a common problem, and it is discussed in detail here: Serializing Foreign Key objects in Django. See that question for some alternatives.
Normally Django expects you to serialize the Time, CostCode, Program, and User objects separately (i.e. a separate JSON array for each) and to refer to them by IDs. The IDs can either be the numeric primary keys (PKs) or a "natural" key defined with natural_key.
You could use natural_key to return any fields you want, including user.username. Alternatively, you could define a custom serializer to output whatever you want there. Either of these approaches will probably make it impossible to load the data back into a Django database, which may not be a problem for you.