Django specify which database to use for module - django

In my Django project I have a couple of applications. One of them is email_lists, and it does a lot of data handling, reading data from the Customer model. In my production environment I have two databases: default and read-replica. I would like all queries in a particular module to be made against the read-replica database.
I can do that if I explicitly tell the query to do so:
def get_customers(self):
    if settings.ENV == 'production':
        customers = Customer.objects.using('read-replica').filter()
    else:
        customers = Customer.objects.filter()
but this module has more than 100 queries to the Customer and other models. I also have queries to relations like:
def get_value(self, customer):
    target_sessions = customer.sessions.filter(status='open')
    carts = Cart.objects.filter(session__in=target_sessions)
the idea is that I want to avoid writing:
if settings.ENV == 'production':
    instance = Model.objects.using('read-replica').filter()
else:
    instance = Model.objects.filter()
for every query. There are other places in my project that do need to read from default database so it can't be a global setting. I just need this module or file to read using the replica.
Is this possible in Django, are there any shortcuts ?
Thanks

You can read up on Django database routers for this. Good examples can be found online as well, and they should be straightforward.
--
Another solution would be to modify the Model manager.
from django.conf import settings
from django.db import models
class ReplicaRoutingManager(models.Manager):
    def get_queryset(self):
        # super() already binds self, so get_queryset() takes no argument
        queryset = super().get_queryset()
        if settings.ENV == 'production':
            return queryset.using('read-replica')
        return queryset
class Customer(models.Model):
    ...
    objects = models.Manager()
    replica_objects = ReplicaRoutingManager()
With this, you can use Customer.replica_objects.filter(...) in the module that should read from the replica and the manager does the routing, while Customer.objects keeps reading from default.
I still suggest going with the database router solution and putting the custom logic in the router class. But if the manager works for you, it's fine.

If you want all the queries in the email_lists app to go to read-replica, then a router is the way to go. If you need to query different databases within the same app, then @ibaguio's solution is the way to go. Here's a basic router example similar to what I'm using:
project/database_routers.py
MAP = {
    'some_app': 'default',
    'some_other_app': 'default',
    'email_lists': 'read-replica',
}
class DatabaseRouter:
    def db_for_read(self, model, **hints):
        return MAP.get(model._meta.app_label, None)
    def db_for_write(self, model, **hints):
        return MAP.get(model._meta.app_label, None)
    def allow_relation(self, object_1, object_2, **hints):
        database_object_1 = MAP.get(object_1._meta.app_label)
        database_object_2 = MAP.get(object_2._meta.app_label)
        return database_object_1 == database_object_2
    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Must return True, False or None (no opinion), not a database alias
        target = MAP.get(app_label)
        return None if target is None else db == target
In settings.py:
DATABASE_ROUTERS = ['project.database_routers.DatabaseRouter']
It looks like you only want it in production, so I would think you could add it conditionally:
if ENV == 'production':
    DATABASE_ROUTERS = ['project.database_routers.DatabaseRouter']
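The router above decides by the model's app label, not by the module that runs the query. If the routing really has to follow the calling code, one possible variant is a router that reads a context variable set by a context manager. This is only a sketch: the names read_from and ScopedReadRouter are made up for illustration, and it assumes default and read-replica hold the same data.

```python
import contextvars
from contextlib import contextmanager

# Alias that reads should use in the current context; None means "no opinion".
_read_alias = contextvars.ContextVar("read_alias", default=None)


@contextmanager
def read_from(alias):
    """Route reads to `alias` while the with-block is active (hypothetical helper)."""
    token = _read_alias.set(alias)
    try:
        yield
    finally:
        _read_alias.reset(token)


class ScopedReadRouter:
    """Hypothetical router: follows the context variable for reads only."""

    def db_for_read(self, model, **hints):
        # None lets Django fall through to the next router or to 'default'.
        return _read_alias.get()

    def db_for_write(self, model, **hints):
        return None

    def allow_relation(self, obj1, obj2, **hints):
        # Assumption: both aliases point at the same data set.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return None


# Intended use inside the module that should read from the replica:
#
#     with read_from('read-replica'):
#         customers = list(Customer.objects.filter(...))
```

Querysets are lazy and pick their database when they are evaluated, so the query has to be evaluated inside the with-block for the alias to apply.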

Related

How can I improve pagination performance in multiple queryset django?

I have 2 querysets:
queryset_primary = PrimaryUserSerializer(FileUpload.objects.all().order_by('name'), many=True, context=context).data
queryset_secondary = MemberSerializer(Members.objects.all().order_by('member_name'), many=True, context=context).data
The two have different keys, so I iterate over both:
response = []
for primary in queryset_primary:
    # name_pri=primary['primary_user_id']
    new_data = {
        'user_id': primary['primary_user_id'],
        'name': primary['name'],
    }
    response.append(new_data)
for secondary in queryset_secondary:
    new_data = {
        'user_id': secondary['secondary_user_id'],
        'name': secondary['member_name'],
    }
    response.append(new_data)
I then pass the merged list through a common serializer that has the shared keys, for pagination:
responses = self.paginate_queryset(response)
if responses is not None:
    serializer = CommonUserSerializer(responses, many=True)
    data = {
        'code': 200,
        'status': "OK",
    }
    page_nated_data = self.get_paginated_response(serializer.data).data
    data.update(page_nated_data)
    data['response'] = data.pop('results')
    return Response(data)
In total it takes about 8 seconds to load.
How can I reduce the API loading time?
When exposing bulk data, avoid models and serializers where possible. Have a look at the Django ORM's .values() option to select only the columns you need, and try to construct your own structures for JSON serialization. I'd also avoid embedding nested data in your structures (especially in list views). It might turn out much cheaper to have a client call twice, or filter your API, than to return 'complete' data at once.
Have you tried profiling the specific API call? Django Debug Toolbar is a convenient tool to pinpoint the bottleneck.
Have a look at .values() in the ORM:
https://docs.djangoproject.com/en/3.0/ref/models/querysets/#values
Members.objects.all().order_by('member_name').values()
This should give you a QuerySet that returns dictionaries, rather than model instances, when used as an iterable.
Then you could couple this with:
from django.http import JsonResponse
def some_view(request):
    # Use list(), because a QuerySet is not JSON serializable by default
    data = list(Members.objects.all().order_by('member_name').values())
    return JsonResponse(data, safe=False)
More specific to your question:
from rest_framework import status
from rest_framework.response import Response
def get(self, request):  # a method on the DRF generic view, since it uses self
    # Concatenate the two lists; append() would nest one list inside another
    merged = (
        list(Members.objects.all().order_by('member_name').values('user_id', 'name'))
        + list(FileUpload.objects.all().order_by('name').values('user_id', 'name'))
    )
    responses = self.paginate_queryset(merged)
    if responses is not None:
        serializer = CommonUserSerializer(responses, many=True)
        data = {
            'success': True,
        }
        paginated_data = self.get_paginated_response(serializer.data).data
        data.update(paginated_data)
        data['response'] = data.pop('results')
        return Response(data, status=status.HTTP_200_OK)
You will also need some annotate() calls to expose primary_user_id and secondary_user_id under the common user_id key (and member_name under name).
Much more efficient!
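The merge-then-paginate idea can be sketched in plain Python, without the ORM. The rows below stand in for what .values() would return. The key names are taken from the question, while normalize and page_of are hypothetical helpers, not part of Django or DRF.

```python
from itertools import chain, islice


def normalize(rows, id_key, name_key):
    """Yield rows in the common {'user_id', 'name'} shape."""
    for row in rows:
        yield {"user_id": row[id_key], "name": row[name_key]}


def page_of(primary_rows, secondary_rows, page, page_size):
    """Return one page (1-based) of the merged rows without building the full list."""
    merged = chain(
        normalize(primary_rows, "primary_user_id", "name"),
        normalize(secondary_rows, "secondary_user_id", "member_name"),
    )
    start = (page - 1) * page_size
    return list(islice(merged, start, start + page_size))


# Made-up sample rows, shaped like dictionaries from .values()
primary = [
    {"primary_user_id": 1, "name": "Ann"},
    {"primary_user_id": 2, "name": "Bob"},
]
secondary = [{"secondary_user_id": 7, "member_name": "Cy"}]

first_page = page_of(primary, secondary, page=1, page_size=2)
second_page = page_of(primary, secondary, page=2, page_size=2)
```

Only the rows of the requested page are converted, which is the part that the per-object serializers made expensive.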

Disabling options in django-autocomplete-light

Just started using django-autocomplete-light (autocomplete.ModelSelect2) and while I have managed to get it working, I wondered if it is possible to pass disabled options?
I have a list of customers to choose from, but some of them, for various reasons, shouldn't be selectable. I know I could filter these non-selectable customers out, but this wouldn't be very usable, as the user might think that the customer isn't in the database. If this is possible, could someone point me in the right direction, as I'm not sure where to start?
It says in the Select2 documentation that disabling options should be possible. Presumably if I could also send a 'disabled':true within the returned json response that might do it.
OK, so here is what I came up with and it works.
view.py
The Select2ViewMixin is subclassed and then a 'disabled' attribute is added to the customer details. This value is provided by the ParentAutocomplete view.
from dal import autocomplete
from dal_select2.views import Select2ViewMixin
from dal.views import BaseQuerySetView
class CustomSelect2ViewMixin(Select2ViewMixin):
    def get_results(self, context):
        return [
            {
                'id': self.get_result_value(result),
                'text': self.get_result_label(result),
                'selected_text': self.get_selected_result_label(result),
                'disabled': self.is_disabled_choice(result),  # <-- this gets added
            } for result in context['object_list']
        ]
class CustomSelect2QuerySetView(CustomSelect2ViewMixin, BaseQuerySetView):
    """Adds ability to pass a disabled property to a choice."""
class ParentAutocomplete(CustomSelect2QuerySetView):
    def get_queryset(self):
        qs = Customer.objects.all()
        if self.q:
            qs = qs.filter(org_name__icontains=self.q)
        return qs.order_by('org_name', 'org_city')
    def get_result_label(self, item):
        return item.selector_name
    def get_selected_result_label(self, item):
        return item.selector_name
    def is_disabled_choice(self, item):  # <-- this is where we determine if the record is selectable or not.
        customer_id = self.forwarded.get('customer_id', None)
        return not (item.can_have_children and not str(item.pk) == customer_id)
form.py
The form is then used as normal.
from dal import autocomplete
from django import forms
class CustomerBaseForm(forms.ModelForm):
    customer_id = forms.IntegerField(required=False, widget=forms.HiddenInput)
    class Meta:
        model = Customer
        widgets = {
            'parent': autocomplete.ModelSelect2(
                url='customer:parent-autocomplete',
                forward=['customer_id'],
            )
        }
Hopefully this might be useful to someone.
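To make the effect of the view concrete, here is a rough sketch of the kind of JSON the autocomplete endpoint would then return, built from plain dicts instead of model instances. The customers are made up, build_payload is a hypothetical helper, and the exact envelope ('results' plus 'pagination') is my understanding of what Select2 consumes rather than something taken from the code above.

```python
import json


def is_disabled_choice(can_have_children, pk, customer_id):
    """Same rule as in the view: disabled unless it can have children and is not the customer itself."""
    return not (can_have_children and str(pk) != customer_id)


def build_payload(customers, customer_id):
    """Serialize customers into a Select2-style response body (hypothetical helper)."""
    results = [
        {
            "id": c["pk"],
            "text": c["label"],
            "disabled": is_disabled_choice(c["can_have_children"], c["pk"], customer_id),
        }
        for c in customers
    ]
    return json.dumps({"results": results, "pagination": {"more": False}})


# Made-up sample data; the customer being edited has pk 3
customers = [
    {"pk": 1, "label": "Acme", "can_have_children": True},
    {"pk": 2, "label": "Bolt", "can_have_children": False},
    {"pk": 3, "label": "Corp", "can_have_children": True},
]
payload = build_payload(customers, customer_id="3")
```

Here Acme stays selectable, Bolt is disabled because it cannot have children, and Corp is disabled because a customer cannot be its own parent.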

Filtering OrganizationUser's by Organization in Django-Organizations

There is a relatively similar thread on this topic, but I can't seem to figure out how to translate it to my situation. I have a roster that I need to only display the organizationusers within the same organization of the viewer. I have a webapp that I am developing that is used to manage volunteers in an organization. I'm still new to backend development so I'm having trouble problem solving.
This is the code for the table view using Django_Tables2 package:
# tables.py
class VolunteerTable(tables.Table):
    class Meta:
        model = OrganizationUser
# views.py
def VolunteerRoster(request):
    table = tables.VolunteerTable(OrganizationUser.objects.all())
    return render(request, 'staff/roster.html', {'table': table})
I'm trying to figure out how to convert the view to a class-based view so I can use the OrganizationMixin together with the SingleTableView from Django_Tables2's documentation.
I was thinking about something like this, based on the other thread's explanation:
class VolunteerRoster(SingleTableView, OrganizationMixin):
    table_class = VolunteerTable
    queryset = OrganizationUser.objects.all()
    template_name = "staff_roster.html"
    def get_queryset(self):
        return self.queryset.filter(organization=self.get_organization())
When I try this I get: "TypeError: __init__() takes 1 positional argument but 2 were given"
As I said, I'm still new to django so I'm not really sure what to fix in this instance.
Try:
def get_queryset(self):
    return OrganizationUser.objects.filter(organization=self.request.user.organization)

Django REST Framework - Filtering

I want to filter multiple fields with multiple queries like this:
api/listings/?subburb=Subburb1, Subburb2&property_type=House,Apartment,Townhouse,Farm .. etc
Are there any built-in ways? I looked at django-filters but it seems limited, and I think I would have to do this manually in my API view, but it's getting messy, filtering on filters on filters.
Filtering on filters on filters is not messy; it is called chaining filters.
Chained filters are necessary because sometimes property_type will be present and sometimes not:
if property_type:
    qs = qs.filter(property_type=property_type)
If you are worried that this results in multiple queries, it does not: it is still executed as a single query, because querysets are lazy.
Alternatively you can build a dict and pass it just one time:
d = {'property_type': property_type, 'subburb': subburb}
qs = MyModel.objects.filter(**d)
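For the comma-separated values in the question's URL, the dict can be built with __in lookups and with absent parameters skipped. The following is a minimal sketch in plain Python: build_in_lookups is a hypothetical helper, and it assumes the query parameter names match the model field names.

```python
def build_in_lookups(query_params, allowed_fields):
    """Turn 'a=x, y&b=z' style parameters into {'a__in': ['x', 'y'], 'b__in': ['z']}."""
    lookups = {}
    for field in allowed_fields:
        raw = query_params.get(field)
        if not raw:
            continue  # parameter absent or empty: do not filter on it
        values = [part.strip() for part in raw.split(",") if part.strip()]
        if values:
            lookups[f"{field}__in"] = values
    return lookups


# Parameters as they would arrive for the URL in the question
params = {
    "subburb": "Subburb1, Subburb2",
    "property_type": "House,Apartment,Townhouse,Farm",
}
lookups = build_in_lookups(params, ("subburb", "property_type", "bedrooms"))
```

The result can then be passed once, as in MyModel.objects.filter(**lookups). Restricting the keys to an explicit tuple of allowed fields keeps clients from filtering on arbitrary columns.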
Complex filters are not supported out of the box by DRF or even by the django-filter plugin. For simple cases you can define your own get_queryset method.
This is straight from the documentation:
def get_queryset(self):
    queryset = Purchase.objects.all()
    username = self.request.query_params.get('username', None)
    if username is not None:
        queryset = queryset.filter(purchaser__username=username)
    return queryset
However, this can quickly become messy if you are supporting multiple filters, some of them complex.
The solution is to define a custom filter backend class and a ViewSet mixin. The mixin tells the viewset how to work with a typical filter backend, and the backend can handle very complex filters, all defined explicitly, including rules for when those filters should be applied.
A sample filter backend looks like this (I have defined three different filters on different query parameters, in increasing order of complexity):
import json
from django.db.models import Q
from django.urls import resolve
from rest_framework.exceptions import ValidationError  # assumed to be DRF's ValidationError
class SomeFiltersBackend(FiltersBackendBase):
    """
    Filter backend class to complement GenericFilterMixin from utils/mixin.
    """
    mapping = {'owner': 'filter_by_owner',
               'catness': 'filter_by_catness',
               'context': 'filter_by_context'}
    def rule(self):
        return resolve(self.request.path_info).url_name == 'pet-owners-list'
Straightforward filter on ORM lookups.
    def filter_by_catness(self, value):
        """
        A simple filter to display owners of pets with high catness, canines excuse.
        """
        catness = self.request.query_params.get('catness')
        return Q(owner__pet__catness__gt=catness)
    def filter_by_owner(self, value):
        if value == 'me':
            return Q(owner=self.request.user.profile)
        elif value.isdigit():
            try:
                profile = PetOwnerProfile.objects.get(user__id=value)
            except PetOwnerProfile.DoesNotExist:
                raise ValidationError('Owner does not exist')
            return Q(owner=profile)
        else:
            raise ValidationError('Wrong filter applied with owner')
More complex filters:
    def filter_by_context(self, value):
        """
        value = {"context_type" : "context_id or context_ids separated by comma"}
        """
        try:
            context = json.loads(value)
        except json.JSONDecodeError as e:
            raise ValidationError(str(e))
        # The payload holds exactly one {context_type: ids} pair
        (context_type, context_ids), = context.items()
        context_ids = [int(i) for i in context_ids.split(',')]
        if context_type == 'default':
            ids = context_ids
        elif context_type in OTHER_CONTEXT_TYPES:  # hypothetical name; the original condition is missing
            ids = Context.get_ids_by_unsupported_contexts(context_type, context_ids)
        else:
            raise ValidationError('Wrong context type found')
        return Q(context_id__in=ids)
To understand fully how this works, you can read up my detailed blogpost : http://iank.it/pluggable-filters-for-django-rest-framework/
All the code is there in a Gist as well : https://gist.github.com/ankitml/fc8f4cf30ff40e19eae6
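The base class is not shown here, so the following is only a guess at how a mapping like the one above could be dispatched; it is not the code from the blog post or the Gist. MappedFilters and its methods are hypothetical, and plain lookup dicts stand in for Q objects so the sketch runs without Django.

```python
class MappedFilters:
    """Stand-in for a mapping-driven filter backend (hypothetical, not FiltersBackendBase)."""

    # query parameter name -> name of the method that handles it
    mapping = {
        "owner": "filter_by_owner",
        "min_catness": "filter_by_min_catness",
    }

    def __init__(self, query_params):
        self.query_params = query_params

    def collect(self):
        """Call the mapped method for every recognised parameter that is present."""
        conditions = {}
        for param, method_name in self.mapping.items():
            if param in self.query_params:
                handler = getattr(self, method_name)
                conditions.update(handler(self.query_params[param]))
        return conditions

    def filter_by_owner(self, value):
        return {"owner_id": int(value)}

    def filter_by_min_catness(self, value):
        return {"owner__pet__catness__gt": int(value)}


# Unknown parameters such as 'page' are ignored by the dispatch
conditions = MappedFilters({"owner": "3", "page": "2"}).collect()
```

Each filter lives in its own method, so adding a query parameter means adding one mapping entry and one method, and the view itself stays unchanged.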

Django data sharding

I have successfully got my application running over several databases using the routing scheme based on models. I.e. model A lives on DB A and model B lives on DB B. I now need to shard my data. I am looking at the docs and having trouble working out how to do it as the same model needs to exist on multiple database servers. I want to have a flag to say DB for NEW members is now database X and that members X-Y live on database N etc.
How do I do that? Is it using **hints, this seems inadequately documented to me.
The hints parameter is designed to help your database router decide where it should read or write its data. It may evolve with future versions of Django, but for now there's just one kind of hint that may be given by the Django framework, and that's the instance it's working on.
I wrote this very simple database router to see what Django does:
# routers.py
import logging
logger = logging.getLogger("my_project")
class DebugRouter(object):
    """A debugging router"""
    def db_for_read(self, model, **hints):
        logger.debug("db_for_read %s" % repr((model, hints)))
        return None
    def db_for_write(self, model, **hints):
        logger.debug("db_for_write %s" % repr((model, hints)))
        return None
    def allow_relation(self, obj1, obj2, **hints):
        logger.debug("allow_relation %s" % repr((obj1, obj2, hints)))
        return None
    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # This hook was allow_syncdb(db, model) before Django 1.7
        logger.debug("allow_migrate %s" % repr((db, app_label, model_name, hints)))
        return None
You declare this in settings.py:
DATABASE_ROUTERS = ["my_project.routers.DebugRouter"]
Make sure logging is properly configured to output debug output (for example to stderr):
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        [...some other handlers...]
        'stderr': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler'
        }
    },
    'loggers': {
        [...some other loggers...]
        'my_project': {
            'handlers': ['stderr'],
            'level': 'DEBUG',
            'propagate': True,
        },
    }
}
Then you can open a Django shell and test a few requests to see what data your router is being given:
$ ./manage.py shell
[...]
>>> from my_project.my_app.models import User
>>> User.objects.get(pk = 1234)
db_for_read (<class 'my_project.my_app.models.User'>, {})
<User: User object>
>>> user = User.objects.create(name = "Arthur", title = "King")
db_for_write (<class 'my_project.my_app.models.User'>, {})
>>> user.name = "Kong"
>>> user.save()
db_for_write (<class 'my_project.my_app.models.User'>, {'instance':
<User: User object>})
>>>
As you can see, hints is always empty when no instance is available (in memory) yet. So you cannot use routers if you need query parameters (the object's id, for example) to determine which database to query. It might become possible in the future if Django provides the query or queryset objects in the hints dict.
So to answer your question, I would say that for now you must create a custom Manager, as suggested by Aaron Merriam. But overriding just the create method is not enough, since you also need to be able to fetch an object in the appropriate database. Something like this might work (not tested yet):
from django.db import models
class CustomManager(models.Manager):
    def find_database_alias(self, pk):
        # ... implement the logic to determine the shard from the pk
        raise NotImplementedError
    def new_object_database_alias(self):
        # ... return the database alias for a new object
        raise NotImplementedError
    def get(self, *args, **kwargs):
        pk = kwargs.get("pk")
        if pk is None:
            raise Exception("Sharded table: you must provide the primary key")
        db_alias = self.find_database_alias(pk)
        qs = self.get_queryset().using(db_alias)  # get_query_set() before Django 1.6
        return qs.get(*args, **kwargs)
    def create(self, *args, **kwargs):
        db_alias = self.new_object_database_alias()
        qs = super().using(db_alias)
        return qs.create(*args, **kwargs)
class ModelA(models.Model):
    objects = CustomManager()
Cheers
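As an illustration of what the find_database_alias placeholder could contain, here is a small range-based lookup in plain Python, matching the "members X-Y live on database N" idea from the question. The shard boundaries and alias names are invented for the example; real values would come from your own configuration.

```python
import bisect

# Hypothetical layout: (first pk stored on the shard, database alias), sorted by first pk
SHARD_RANGES = [
    (1, "shard_a"),
    (100_000, "shard_b"),
    (250_000, "shard_c"),
]
_STARTS = [start for start, _ in SHARD_RANGES]

# New members go to the shard with the highest starting pk (assumption)
NEW_OBJECT_ALIAS = SHARD_RANGES[-1][1]


def find_database_alias(pk):
    """Return the alias of the shard whose range contains pk."""
    if pk < _STARTS[0]:
        raise ValueError(f"pk {pk} is below the first shard range")
    # bisect_right finds the first start greater than pk; the shard before it owns pk
    index = bisect.bisect_right(_STARTS, pk) - 1
    return SHARD_RANGES[index][1]
```

Moving new members to another database then means appending one entry to SHARD_RANGES, while existing primary keys keep resolving to the shard they were written to.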
using() should allow you to designate which database you want to use.
Overriding the create method might accomplish what you're looking to do.
class CustomManager(models.Manager):
    def get_queryset(self):  # get_query_set() before Django 1.6
        return super().get_queryset()
    def create(self, *args, **kwargs):
        return super().using('OTHER_DB').create(*args, **kwargs)
class ModelA(models.Model):
    objects = CustomManager()
I have not tested this so I don't know if you can tack a 'create' onto the end of a 'using'