Integrate Dataframe from a Python File to Django - django

I've developed a complex data analysis model in Python (for simplicity, let's call it analysis.py) that contains several long algorithms to transform a set of raw data, and it returns its output as a dataframe.
I've recently been going through some tutorials on the Django framework and have built a simple web app resembling a forum/blog.
Using the dataframe output from analysis.py, I would like to display the data as multiple charts on the web app.
Can anyone point me in a direction to move forward from here? I've looked through multiple resources online, but I don't think I've found the right keywords for what I actually want to build. I'm just looking for some skeleton code I can integrate into my current Django backend so that I can call the dataframe from analysis.py.
Thank you very much in advance.

Funnily enough, after posting the question here I managed to come up with better keywords and found a solution that closely matches my intended product.
Again, I apologize if my question was too vague or did not include any of the code I had written so far; I was merely looking for a direction to move forward in, and fortunately I found one after giving it more thought and combining solutions from multiple sources on the net.
I'm not sure if my solution is the best one, but it works for me, and if it helps other people who are in the same boat I was in, I'd be glad!
My solution imports the dataframe from analysis.py and then exposes the data I want as API endpoints (using Django REST Framework) so it can be displayed in my dashboard.
The following are some details (with some scripts) of my solution:
views.py
# Import the external python script (analysis.py).
# This is where the data from the csv is imported and transformed.
# The script is written as a function and returns the dataframe.
from . import analysis
from .analysis import data_transform

# Django REST Framework
from rest_framework.views import APIView
from rest_framework.response import Response

from django.http import HttpResponse, JsonResponse
from django.shortcuts import render, redirect


class ChartData(APIView):
    authentication_classes = []
    permission_classes = []

    def get(self, request, format=None):
        # Assign the dataframe output to a new dataframe
        df = data_transform()

        # Pull the series we want out of the dataframe
        data1 = df['VOL'].values.tolist()
        data2 = df['PRICE'].values.tolist()

        data = {
            "data1": data1,
            "data2": data2,
        }

        # Return the endpoints in JSON format
        return Response(data)
analysis.py looks roughly like this:
import pandas as pd
import numpy as np

def data_transform():
    # ... where all the transformations happen ...
    return dataframe
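The view also needs to be wired up in urls.py so the dashboard can fetch it. A minimal sketch (the URL path and route name here are placeholders I picked, not part of the original setup):
# urls.py (sketch; path and name are arbitrary placeholders)
from django.urls import path
from .views import ChartData

urlpatterns = [
    # The dashboard template can fetch its chart data from /api/chart/data/
    path('api/chart/data/', ChartData.as_view(), name='chart-data'),
]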
I then passed the endpoints ('data') to an HTML template in order to visualize them with Chart.js. You can find some good resources on YouTube; look up the keywords chart.js and django.
Again, I'm not sure this is the best method. I am open to a better solution, if you guys have any!

Related

How to customize API end points on redoc API documentation

I am using redoc with django==2.0 to document a Django API. I noticed that by default redoc names the endpoints automatically (visible on the left-hand side of the screenshot in the original post). I don't want to use the generated names; I want to customize them. Could someone with experience with redoc documentation please advise?
If you are using drf-yasg, you can use the swagger_auto_schema decorator to configure the operation_id.
from drf_yasg.utils import swagger_auto_schema
from django.utils.decorators import method_decorator
from rest_framework.generics import ListAPIView

@method_decorator(name='get', decorator=swagger_auto_schema(operation_id='List Widgets', operation_description='List all available widgets'))
class WidgetListView(ListAPIView):
    serializer_class = WidgetSerializer

    def get_queryset(self):
        return Widget.objects.all()
These summaries are actually populated from the input JSON; in the source document they live at ["paths"][path][method]["summary"].
You may want to edit those entries in order to change the summaries.
If you don't want to change the source input, a workaround is to change the text of the DOM elements after ReDoc loads.
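By way of illustration only (the file name, endpoint, and summary text below are made up), rewriting those entries in the generated spec could look like this:
import json

# Load the generated OpenAPI/Swagger document (file name is an assumption).
with open('openapi.json') as fh:
    spec = json.load(fh)

# Overwrite the auto-generated summaries at ["paths"][path][method]["summary"].
custom_summaries = {
    ('/widgets/', 'get'): 'List Widgets',  # hypothetical endpoint and name
}
for (path, method), summary in custom_summaries.items():
    spec['paths'][path][method]['summary'] = summary

with open('openapi.json', 'w') as fh:
    json.dump(spec, fh, indent=2)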

Same hypothesis test for different django models

I want to use hypothesis to test a tool we've written to create avro schema from Django models. Writing tests for a single model is simple enough using the django extra:
from avro.io import AvroTypeException
from hypothesis import given
from hypothesis.extra.django.models import models as hypothetical
from my_code import models

@given(hypothetical(models.Foo))
def test_amodel_schema(self, amodel):
    """Test a model through avro_utils.AvroSchema"""
    # Get the already-created schema for the current model:
    schema = (s for m, s in SCHEMA if m == amodel.model_name)
    for schemata in schema:
        error = None
        try:
            schemata.add_django_object(amodel)
        except AvroTypeException as exc:
            error = exc
        assert error is None
...but if I were to write tests for every model that can be avro-schema-ified, they would be exactly the same except for the argument to the given decorator. I can get all the models I'm interested in testing with ContentTypeCache.list_models(), which returns a dictionary of schema_name: model (yes, I know, it's not a list). But how can I generate code like
for schema_name, model in ContentTypeCache.list_models().items():
    @given(hypothetical(model))
    def test_this_schema(self, amodel):
        # Same logic as above
I've considered dynamically generating each test method and attaching it directly to globals(), but that sounds awfully hard to understand later. How can I write the same basic parametrised tests for different Django models with the least confusing dynamic programming possible?
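A rough sketch of that globals() idea, purely for illustration (the make_test helper and the generated names are made up, and it assumes module-level pytest-style tests rather than TestCase methods):
def make_test(model):
    @given(hypothetical(model))
    def test(amodel):
        ...  # same schema-checking logic as above
    return test

for schema_name, model in ContentTypeCache.list_models().items():
    # One generated test function per model, named after its schema.
    globals()['test_%s_schema' % schema_name] = make_test(model)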
You could write it as a single test using one_of:
import hypothesis.strategies as st

@given(st.one_of([hypothetical(model) for model in ContentTypeCache.list_models().values()]))
def test_this_schema(self, amodel):
    # Same logic as above
You might want to raise the number of examples run in this case, using something like @settings(max_examples=settings.default.max_examples * len(ContentTypeCache.list_models())), so that it runs the same number of examples as N separate tests.
I would usually solve this kind of problem by parametrising the test, and drawing from the strategy internally:
@pytest.mark.parametrize('model_type', list(ContentTypeCache.list_models().values()))
@given(data=st.data())
def test_amodel_schema(self, model_type, data):
    amodel = data.draw(hypothetical(model_type))
    ...

reverse django url to object, not view. possible?

I have a set of URLs for which I would like to retrieve the Django model associated with each URL, not the Django view (which is what the reverse URL dispatcher gives you). The code would ideally look something like this:
urls_to_lookup = get_urls_to_lookup()
models = []
for url in urls_to_lookup:
    model = retrieve_django_model(url)
    models.append(model)
Since the URLs I would like to look up have unique models associated with them (via the @permalink decorator), it seems like this should be possible, but my google skillz are coming up empty-handed. Thanks for your help!
EDIT In case it helps with brainstorming solutions: I'm pulling these URLs from Google Analytics for all blog posts, and I want to dynamically display the most frequently viewed pages. The URL itself is helpful, but I would like to grab the title, teaser, etc. for each blog post for display, and that is all stored in the database.
If you are trying to create a sitemap; there's the sitemaps contrib app.
If you are trying to print out all the URLs in a nice format, see this answer.
I'm trying to think of a reason for having such a feature, but it escapes me. However, this should do what you want (not tested):
from django.db import models

def retrieve_django_model(url):
    # Only consider models that actually have instances, since we can only
    # get an absolute URL from an instance.
    m_instances = [m for m in models.get_models() if m.objects.all().count()]
    for m in m_instances:
        if m.objects.all().order_by('?')[0].get_absolute_url() == url:
            return m
    else:
        return None
Since we can only fetch the absolute URL from instances, not models, the initial list comprehension filters out the models that have no instances, as we cannot get an absolute URL for those.
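If what you actually need is the matching object itself (e.g. the blog post whose title and teaser you want to display) rather than its model class, a variation on the same brute-force idea, equally untested, might look like this:
def retrieve_django_object(url):
    # Walk every instance of every model and compare absolute URLs.
    # This scans every row, so it is only reasonable for small data sets.
    for m in models.get_models():
        for obj in m.objects.all():
            if hasattr(obj, 'get_absolute_url') and obj.get_absolute_url() == url:
                return obj
    return None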

Django JSONField dumping/loading

I'm using JSONField in some of my Django models and would like to migrate this data from Oracle to Postgres.
So far I haven't had any luck keeping this JSON data intact when using Django's dumpdata and loaddata commands; the data gets transformed into string representations of the JSON. I've yet to find a good solution to this... Ideas?
I ended up solving this problem by overriding Django's included JSON serializer, specifically the handle_field method, in a custom serializer file called custom_json_serializer.py. By doing this I can ensure that specific JSONFields stay as is, without being converted to string.
On the chance anyone else runs into this issue, these are the steps I took. I had to add this custom serializer to the settings.py file:
SERIALIZATION_MODULES = {
    'custom_json': 'myapp.utils.custom_json_serializer',
}
and then call it when serializing the data from Django:
python manage.py dumpdata mymodel --format=custom_json --indent=2 --traceback > mymodel_data.json
The custom serializer looks like:
from django.core.serializers.json import Serializer as JSONSerializer
from django.utils.encoding import is_protected_type

# JSONFields that are normally (incorrectly) serialized as strings
json_fields = ['problem_field1', 'problem_field2']

class Serializer(JSONSerializer):
    """
    A fix on JSONSerializer in order to prevent stringifying JSONField data.
    """
    def handle_field(self, obj, field):
        value = field._get_val_from_obj(obj)
        # Protected types (i.e., primitives like None, numbers, dates,
        # and Decimals) are passed through as is. All other values are
        # converted to string first.
        if is_protected_type(value) or field.name in json_fields:
            self._current[field.name] = value
        else:
            self._current[field.name] = field.value_to_string(obj)
The really strange part is that before this fix some JSONFields were serializing just fine, while others were not. That is why I took the approach of specifying the fields to be handled. Now all data is serializing correctly.
I haven't used the JSONField before, but what I do is:
import json
data_structure = json.loads(myData)
Maybe that will work for what you need as well. There's likely a better way to deal with this.
EDIT: The following applies only if you end up using the json package.
If you are using Python 2.6 and above you can use:
import json
otherwise, you can use the simplejson that is bundled with django.utils (for Python < 2.6).
from django.utils import simplejson as json
That way you can continue to use the same package name, and take your code to Google App Engine as it supports Python 2.5.2 at the moment.
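A common way to keep a single import name across both cases (just a convenience sketch, not something from the original answer) is an ImportError fallback:
try:
    import json  # Python 2.6+ standard library
except ImportError:
    from django.utils import simplejson as json  # older Pythons / Google App Engine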

Multiple Databases in Django 1.0.2 with custom manager

I asked this in the users group with no response, so I thought I would try here.
I am trying to set up a custom manager to connect to another database on the same server as my default MySQL connection. I have tried following the examples here and here but have had no luck: I get an empty tuple when returning MyCustomModel.objects.all().
Here is what I have in manager.py
from django.db import models
from django.db.backends.mysql.base import DatabaseWrapper
from django.conf import settings

class CustomManager(models.Manager):
    """
    This Manager lets you set the DATABASE_NAME on a per-model basis.
    """
    def __init__(self, database_name, *args, **kwargs):
        models.Manager.__init__(self, *args, **kwargs)
        self.database_name = database_name

    def get_query_set(self):
        qs = models.Manager.get_query_set(self)
        qs.query.connection = self.get_db_wrapper()
        return qs

    def get_db_wrapper(self):
        # Monkeypatch the settings file. This is not thread-safe!
        old_db_name = settings.DATABASE_NAME
        settings.DATABASE_NAME = self.database_name
        wrapper = DatabaseWrapper()
        wrapper._cursor(settings)
        settings.DATABASE_NAME = old_db_name
        return wrapper
and here is what I have in models.py:
from django.db import models
from myproject.myapp.manager import CustomManager

class MyCustomModel(models.Model):
    field1 = models.CharField(max_length=765)
    attribute = models.CharField(max_length=765)

    objects = CustomManager('custom_database_name')

    class Meta:
        abstract = True
But if I run MyCustomModel.objects.all() I get an empty list.
I am pretty new at this stuff, so I am not sure whether this works with 1.0.2. I am going to look into the Manager code to see if I can figure it out, but I am just wondering if I am doing something wrong here.
UPDATE:
This is now in Django trunk and will be part of the 1.2 release:
http://docs.djangoproject.com/en/dev/topics/db/multi-db/
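That API boils down to declaring each extra database in settings and routing queries with using(). A minimal sketch (the 'analytics' alias and the connection details here are made up):
# settings.py (Django 1.2+); 'analytics' is an example alias
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'main_db',
        # user, password, host, etc.
    },
    'analytics': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'custom_database_name',
        # user, password, host, etc.
    },
}

# anywhere in your code: route a query to the second database
# (assuming a concrete, non-abstract model)
MyCustomModel.objects.using('analytics').all()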
You may want to speak to Alex Gaynor, as he is adding multi-DB support and it's pegged for possible release in Django 1.2. I'm sure he would appreciate feedback and input from those who are going to be using multi-DB. There are discussions about it on the django-developers mailing list. His multi-db branch may even be usable, I'm not sure.
Since I guess you probably can't wait, and in case the multi-db branch isn't usable, here are your options.
Follow Eric Flo's method, bearing in mind that it's not supported and new releases of Django may break it. Also, some comments suggest it has already been broken. This is going to be hacky.
Your other option would be to use a totally different database access method for one of your databases, perhaps SQLAlchemy for one and the Django ORM for the other. I'm guessing that one database is more Django-centric and the other is a legacy database.
To summarise: I think hacking multi-DB support into Django is probably the wrong way to go unless you're prepared to keep maintaining your hacks later on. Another ORM or database access layer would give you the cleanest route, since you would not be stepping outside supported features, and at the end of the day it's all just Python.
My company has had success using multiple databases by closely following this blog post: http://www.eflorenzano.com/blog/post/easy-multi-database-support-django/
This probably isn't the answer you're looking for, but it's probably best if you move everything you need into one database.