Wagtail add functions to models.py - django

I'm trying to make a custom Plotly graphic on a Wagtail homepage.
I got this far. I'm overriding the Wagtail Page model by altering the context returned to the template. Am I doing this the right way, and is this possible in models.py?
Thanks in advance.
from django.db import models
from wagtail.models import Page
from wagtail.fields import RichTextField
from wagtail.admin.panels import FieldPanel
import psycopg2
from psycopg2 import sql
import pandas as pd
import plotly.graph_objs as go
from plotly.offline import plot
class CasPage(Page):
    body = RichTextField(blank=True)
    content_panels = Page.content_panels + [
        FieldPanel('body'),
    ]
    def get_connection(self):
        try:
            return psycopg2.connect(
                database="xxxx",
                user="xxxx",
                password="xxxx",
                host="xxxxxxxxxxxxx",
                port=xxxxx,
            )
        except:
            return False
    conn = get_connection()
    cursor = conn.cursor()
    strquery = (f'''SELECT t.datum, t.grwaarde - LAG(t.grwaarde,1) OVER (ORDER BY datum) AS
    gebruiktgas
    FROM XXX
    ''')
    data = pd.read_sql(strquery, conn)
    fig1 = go.Figure(
        data = data,
        layout=go.Layout(
            title="Gas-verbruik",
            yaxis_title="aantal M3")
    )
    output = plotly.plot(fig1, output_type='div', include_plotlyjs=False)
    # https://stackoverflow.com/questions/32626815/wagtail-views-extra-context
    def get_context(self, request):
        context = super(CasPage, self).get_context(request)
        context['output'] = output
        return context

Kind of the right track. You should move all the plot code into its own method though. At the moment, it runs the plot code when the site initialises then stays stored in memory.
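To make the difference concrete, here is a minimal sketch in plain Python (no Wagtail needed). build_plot, EagerPage and LazyPage are made-up names standing in for the query-and-plot step and the page model; the point is only when the code runs.

```python
# Hypothetical stand-in for the expensive "query the database and build the plot" step.
calls = []

def build_plot(label):
    calls.append(label)
    return f"<div>plot built by {label}</div>"


class EagerPage:
    # Statements in a class body run once, while the class is being defined
    # (for a model, when models.py is imported), not once per request.
    output = build_plot("class body")


class LazyPage:
    @property
    def plot(self):
        # Runs on every attribute access, so each page load gets fresh data.
        return build_plot("property")


calls_after_import = list(calls)  # only the class-body call has happened so far
page = LazyPage()
first = page.plot
second = page.plot
```

After this runs, calls holds one "class body" entry and two "property" entries: the class-level version was built once and then reused, while the property was rebuilt on each access.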
There are three usual ways to get the plot to the rendered page then:
As you've done, with context
As a property or method of the page class
As a template tag called from the template
The first two have more or less the same effect, except the second makes the property available anywhere, not just in the template. The context method runs before the page starts rendering; the other two run during rendering. The only real difference is with template caching: get_context always runs each time the page is loaded, while the other two run only when the cache is invalid, or when the code sits outside the cached fragment (for fragment caching).
To call the plot as a property of your page class, you'd just pull out the code into a def with the @property decorator:
class CasPage(Page):
    ....
    @property
    def plot(self):
        try:
            conn = psycopg2.connect(
                database="xxxx",
                user="xxxx",
                password="xxxx",
                host="xxxxxxxxxxxxx",
                port=xxxxx,
            )
            strquery = '''SELECT t.datum, t.grwaarde - LAG(t.grwaarde,1) OVER (ORDER BY datum) AS
                gebruiktgas FROM XXX'''
            data = pd.read_sql(strquery, conn)
            conn.close()
            fig1 = go.Figure(
                # go.Figure expects traces, not a DataFrame; Scatter is an assumed trace type
                data=[go.Scatter(x=data["datum"], y=data["gebruiktgas"])],
                layout=go.Layout(
                    title="Gas-verbruik",
                    yaxis_title="aantal M3")
            )
            # "plot" here is the function imported with "from plotly.offline import plot"
            return plot(fig1, output_type='div', include_plotlyjs=False)
        except Exception as e:
            print(f"{type(e).__name__} at line {e.__traceback__.tb_lineno} of {__file__}: {e}")
            return None
^ I haven't tried this code ... it should work as is, but no guarantees I didn't make a typo ;)
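Now you can access your plot with {{ self.plot|safe }} in the template (the plot is returned as a plain string, so without the safe filter Django would escape the HTML).
If you want to stick with context, then you'd keep your get_context def above and just amend your output line to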
context['output'] = self.plot
Template tags are more useful when they're being used in StructBlocks and not part of a page class like this, or where you have code that you want to re-use in multiple templates.
Then you'd move all that plot code into a template tag file, register it and call it in the template with {% plot %}. Wagtail template tags work the same as Django: https://docs.djangoproject.com/en/4.1/howto/custom-template-tags/
Is the plot data outside of the site database? If not, you could probably get the data via the ORM if it was defined as a model. If so, it's probably worth writing a view (or stored procedure if you want to pass parameters) on the db server and calling that rather than hard coding the SQL into your python.
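As a rough sketch of the "view on the db server" idea, the snippet below uses an in-memory SQLite database as a stand-in for the PostgreSQL server (window functions need SQLite 3.25 or newer). The table name readings, the view name gas_usage and the sample rows are made up for illustration; only the LAG expression comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table of cumulative meter readings.
conn.execute("CREATE TABLE readings (datum TEXT, grwaarde REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("2023-01-01", 100.0), ("2023-01-02", 103.5), ("2023-01-03", 109.0)],
)

# The window-function SQL lives in the database as a view...
conn.execute("""
    CREATE VIEW gas_usage AS
    SELECT datum,
           grwaarde - LAG(grwaarde, 1) OVER (ORDER BY datum) AS gebruiktgas
    FROM readings
""")

# ...so the Python side only needs a trivial SELECT.
rows = conn.execute(
    "SELECT datum, gebruiktgas FROM gas_usage ORDER BY datum"
).fetchall()
conn.close()
```

The first row has no previous reading, so its gebruiktgas is NULL (None in Python); that row may need filtering out before plotting.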
The other consideration is the page load time - if the dataset is big, this could take a while and prevent the page from loading. You'd probably want a front-end solution in that case.

Related

Django function for views takes too long

I'm currently using a Docker & Django setup. I have to fill a database with data from API requests. I was hoping to do this every time you went on a certain page (pretty easy: just have your views.py call the function that fills the database and voila).
But the problem is, the function takes a long time, several minutes from within django (and about half the time with Spyder).
So I usually just get a TimeOut and the page never loads (I admit I have a lot of API requests being made).
I've read some stuff on using Celery but am not quite sure how it's supposed to work.
Anyone know how I could get around this to be able to load the database?
Edit: some code
Views.py
def index(request):
    fill_db()
    context = {}
    context['segment'] = 'index'
    html_template = loader.get_template( 'index.html' )
    return HttpResponse(html_template.render(context, request))
fill_db function
def fill_db():
    fill_agencies()
    fill_companies()
    fill_contracts()
    fill_orders()
    fill_projects()
    fill_resources()
Example of a fill function:
r = pip._vendor.requests.get(BASE_URL+EXTENSION,auth=(USER,PASS))
data0 = json.loads(r.text)
conn = sqlite3.connect('/app/database.sqlite3')
c = conn.cursor()
for client in data0['data']:
    BoondID = client['id']
    name = client['attributes']['name']
    expertiseArea = client['attributes']['expertiseArea']
    town = client['attributes']['town']
    country = client['attributes']['country']
    mainManager = client['relationships']['mainManager']['data']['id']
    values = (BoondID, name, expertiseArea, town, country, mainManager)
    c.execute("INSERT OR REPLACE INTO COMPANIES (BoondID,name,expertiseArea,town,country,mainManager) VALUES (?,?,?,?,?,?);", values)
conn.commit()
conn.close()
Solved.
I used python's threading library.
I defined
agencies_thread = threading.Thread(target=fill_agencies, name="Database Updater")
and called agencies_thread.start() inside my views function.
This works fine.
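For reference, a minimal sketch of that pattern in plain Python. The fill_db body is a stand-in (a sleep) and index returns a string instead of an HttpResponse, so none of this is the actual project code; it only shows the view returning before the slow job has finished.

```python
import threading
import time

finished = threading.Event()

def fill_db():
    # Stand-in for the slow API requests and database inserts.
    time.sleep(0.5)
    finished.set()

def index(request=None):
    # Start the slow job in the background and respond immediately.
    worker = threading.Thread(target=fill_db, name="Database Updater", daemon=True)
    worker.start()
    return "page rendered", worker

start = time.monotonic()
response, worker = index()
elapsed = time.monotonic() - start  # the view returned without waiting for fill_db

worker.join()  # only here, to let the sketch finish cleanly
```

Keep in mind that such a thread lives inside the web server process: if that process restarts, the job is lost, and nothing retries it. A task queue such as Celery, which the question mentions, is the usual choice when that matters.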

Django Cannot find cursor object in connection object

I am developing a web application using Django. I am new to Django.
I have to call a stored procedure from my application. I went through the Django documentation and found out that I can call the procedure using a cursor object. But I cannot find the cursor object in the connection object.
This is how my code looks like :
from django.db import connection
cursor = connection.cursor()
But I cannot find the cursor object itself in the connection.
Please help me find out where I am going wrong.
I can't see anything wrong with the code you posted. I'll assume you don't know how to proceed once you have the cursor, so here is an example:
from django.db import models
from django.db import connection
class Document(models.Model):
    # fields
    url = models.CharField(max_length=900)
    content = models.TextField()
    title = models.TextField()
    # static method to perform a fulltext search
    @staticmethod
    def search(search_string):
        # create a cursor
        cur = connection.cursor()
        # execute the stored procedure passing in
        # search_string as a parameter
        cur.callproc('searcher_document_search', [search_string,])
        # grab the results
        results = cur.fetchall()
        cur.close()
        # wrap the results up into Document domain objects
        return [Document(*row) for row in results]
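The same cursor pattern (open a cursor, run something with parameters, fetch, close, wrap the rows) can be tried without a database server. The sketch below uses the standard library's sqlite3 as a stand-in; SQLite has no stored procedures, so a parameterised SELECT takes the place of callproc, and the documents table and the two-field Document dataclass are made up for illustration.

```python
import sqlite3
from contextlib import closing
from dataclasses import dataclass

@dataclass
class Document:
    url: str
    title: str

def search(conn, search_string):
    # closing() guarantees cur.close() runs even if the query raises.
    with closing(conn.cursor()) as cur:
        # Parameters are passed separately, never formatted into the SQL string.
        cur.execute(
            "SELECT url, title FROM documents WHERE title LIKE ?",
            (f"%{search_string}%",),
        )
        # Wrap each row tuple in a domain object.
        return [Document(*row) for row in cur.fetchall()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (url TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [("/a", "Django cursors"), ("/b", "Scrapy pipelines")],
)
results = search(conn, "cursor")
conn.close()
```

Django's own cursor can also be used as a context manager (with connection.cursor() as cursor:), which closes it for you in the same way.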

Get scrapy result inside a Django view

I'm successfully scraping a page that returns a unique item. I don't want to save the scraped item either in the database or in a file. I need to get it inside a Django view.
My view is as follows:
def start_crawl(process_number, court):
    """
    Starts the crawler.
    Args:
        process_number (str): Process number to be found.
        court (str): Court of the process.
    """
    runner = CrawlerRunner(get_project_settings())
    results = list()
    def crawler_results(sender, parse_result, **kwargs):
        results.append(parse_result)
    dispatcher.connect(crawler_results, signal=signals.item_passed)
    process_info = runner.crawl(MySpider, process_number=process_number, court=court)
    return results
I followed this solution, but the results list is always empty.
I read something about creating a custom middleware and getting the results in the process_spider_output method.
How can I get the desired result?
Thanks!
I managed to implement something like that in one of my projects. It is a mini-project and I was looking for a quick solution. You might need to modify it, or add support for multi-threading etc., if you put it in a production environment.
Overview
I created an ItemPipeline that just add the items into a InMemoryItemStore helper. Then, in my __main__ code I wait for the crawler to finish, and pop all the items out of the InMemoryItemStore. Then I can manipulate the items as I wish.
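The flow described above can be sketched without Scrapy installed. In the snippet below, fake_crawl is a made-up stand-in for the blocking crawl, and the store is a simplified version of the helper (a plain class-level list), so treat it as an illustration of the idea rather than the project's code.

```python
class InMemoryItemStore:
    _items = []

    @classmethod
    def add_item(cls, item):
        cls._items.append(item)

    @classmethod
    def pop_items(cls):
        # Hand back everything collected so far and start over with an empty list.
        items, cls._items = cls._items, []
        return items


class StoreInMemoryPipeline:
    def process_item(self, item, spider):
        InMemoryItemStore.add_item(item)
        return item  # pass the item on unchanged to any later pipeline


def fake_crawl(pipeline):
    # Hypothetical stand-in for the crawl: each scraped item goes through the pipeline.
    for n in (1, 2):
        pipeline.process_item({"process_number": n}, spider=None)


InMemoryItemStore.pop_items()         # discard leftovers from an earlier run
fake_crawl(StoreInMemoryPipeline())   # "crawl" and collect
items = InMemoryItemStore.pop_items()
leftover = InMemoryItemStore.pop_items()
```

Because the store is shared by the whole process, two crawls running at the same time would mix their items; that is the multi-threading caveat mentioned above.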
Code
items_store.py
Hacky in-memory store. It is not very elegant but it got the job done for me. Modify and improve if you wish. I've implemented that as a simple class object so I can simply import it anywhere in the project and use it without passing its instance around.
class InMemoryItemStore(object):
    __ITEM_STORE = None
    @classmethod
    def pop_items(cls):
        items = cls.__ITEM_STORE or []
        cls.__ITEM_STORE = None
        return items
    @classmethod
    def add_item(cls, item):
        if not cls.__ITEM_STORE:
            cls.__ITEM_STORE = []
        cls.__ITEM_STORE.append(item)
pipelines.py
This pipeline will store the objects in the in-memory store from the snippet above. All items are simply returned to keep the regular pipeline flow intact. If you don't want to pass some items down to the other pipelines, simply change process_item to not return all items.
from <your-project>.items_store import InMemoryItemStore
class StoreInMemoryPipeline(object):
    """Add items to the in-memory item store."""
    def process_item(self, item, spider):
        InMemoryItemStore.add_item(item)
        return item
settings.py
Now add the StoreInMemoryPipeline in the scraper settings. If you change the process_item method above, make sure you set the proper priority here (changing the 100 down here).
ITEM_PIPELINES = {
    ...
    '<your-project-name>.pipelines.StoreInMemoryPipeline': 100,
    ...
}
main.py
This is where I tie all these things together. I clean the in-memory store, run the crawler, and fetch all the items.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from <your-project>.items_store import InMemoryItemStore
from <your-project>.spiders.your_spider import YourSpider
def get_crawler_items(**kwargs):
    InMemoryItemStore.pop_items()
    process = CrawlerProcess(get_project_settings())
    process.crawl(YourSpider, **kwargs)
    process.start() # the script will block here until the crawling is finished
    process.stop()
    return InMemoryItemStore.pop_items()
if __name__ == "__main__":
    items = get_crawler_items()
If you really want to collect all data in a "special" object, store the data in a separate pipeline like https://doc.scrapy.org/en/latest/topics/item-pipeline.html#duplicates-filter and, in close_spider (https://doc.scrapy.org/en/latest/topics/item-pipeline.html?highlight=close_spider#close_spider), open your Django object.

Django Python Appengine

I came across this tutorial:
http://thomas.broxrost.com/2008/04/08/django-on-google-app-engine/
Fantastic!
Everything worked.
I just did not fully understand the code below because in comparison to Django it seems different:
views.py:
def main(request):
    visitor = Visitor()
    visitor.ip = request.META["REMOTE_ADDR"]
    visitor.put()
    result = ""
    visitors = Visitor.all()
    visitors.order("-added_on")
    for visitor in visitors.fetch(limit=40):
        result += visitor.ip + u" visited on " + unicode(visitor.added_on) + u""
    return HttpResponse(result)
#model.py:
from google.appengine.ext import db
class Visitor(db.Model):
    ip = db.StringProperty()
    added_on = db.DateTimeProperty(auto_now_add=True)
What exactly is Visitor() ? A tuple a list?
And what does visitor.ip , visitor.put(), visitors.fetch() do exactly?
I believe:
visitor.ip saves the request.META["REMOTE_ADDR"] in the db field.
visitor.put() saves it.
visitors.fetch(limit = 40) extracts it from the db.
What I was trying to do is a template that lists every IP below the next one.
I believed:
<p><ol><Li> {{ result }} </li></ol></p>
Would do the trick.
But it didn't.
Thanks !
Visitor is a class, and each property on it represents a column in your datastore.
When you do visitor = Visitor() you're creating a new, unsaved instance in memory; calling visitor.put() is what actually writes it to the datastore. Visitor.all() returns a query object over all the stored visitors (not a list, tuple, or dictionary); visitors.order("-added_on") sorts that query, and visitors.fetch(limit=40) runs it and returns up to 40 results as a list.
The reason your template isn't working is because your function in views.py isn't specifying any template. This is taken from the Django tutorial: http://docs.djangoproject.com/en/1.0/intro/tutorial03/
from django.template import Context, loader
from mysite.polls.models import Poll
from django.http import HttpResponse
def index(request):
    latest_poll_list = Poll.objects.all().order_by('-pub_date')[:5]
    t = loader.get_template('polls/index.html')
    c = Context({
        'latest_poll_list': latest_poll_list,
    })
    return HttpResponse(t.render(c))
The parameter for Context() is a dictionary. The string on the left is what the name of the variable will be within the template, and the right side is what actual variable that it corresponds to. In your example you can use {'mylist': result}, and in your template you could use {{ mylist }} instead of {{ result }}
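To illustrate the name-to-value mapping without a Django project, here is a rough stand-in using the standard library's string.Template (this is not Django's template engine, and render_visitors is a made-up helper). It also shows one list item per IP, which is what the question was after.

```python
from html import escape
from string import Template

# "$items" is the placeholder name the template looks up in the context dict.
page = Template("<ol>\n$items\n</ol>")

def render_visitors(visitor_ips):
    # One <li> per IP, instead of one long concatenated string.
    items = "\n".join(f"<li>{escape(ip)}</li>" for ip in visitor_ips)
    context = {"items": items}  # left: name used in the template, right: the value
    return page.substitute(context)

html = render_visitors(["10.0.0.1", "10.0.0.2"])
```

In a real Django template you would normally pass the list itself in the context and loop over it with {% for %}, emitting one <li> per entry.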
You also need to make sure to set a template directory in settings.py; the template in the above example (polls/index.html) must live within that template dir.
Without knowing anything about the app engine, I'd say this: Visitor() returns an instance of the Visitor class. The step that follows (visitor.ip = request.META["REMOTE_ADDR"]) sets an attribute of the instance created in the first line.

Django templatetag "order of processing"

I am trying to write a set of template tags that allow you to easily specify js and css files from within the template files themselves. Something along the lines of {% requires global.css %}, and later in the request, {% get_required_css %}.
I have this mostly working, but there are a couple of issues. We'll start with the 'timing' issues.
Each template tag is made up of two steps, call/init and render. Every call/init happens before any render procedure is called. In order to guarantee that all of the files are queued before the {% get_required_css %} is rendered, I need to build my list of required files in the call/init procedures themselves.
So, I need to collect all of the files into one bundle per request. The context dict is obviously the place for this, but unfortunately, the call/init doesn't have access to the context variable.
Is this making sense? Anyone see a way around this (without resorting to a hack-y global request object)?
Another possibility is to store these in a local dict, but they would still need to be tied to the request somehow... possibly some sort of {% start_requires %} tag? But I have no clue how to make that work either.
I've come up with a way to do this which more suits your needs. It will have a bit more load on the server, but proper caching can help to alleviate most of that. Below I've outlined a way that should work if the CSS includes are the same for each path. You'll need to create a single view to include all of these files, but you can actually optimize your CSS using this method, making only a single CSS call for each page.
import hashlib
from django import template
register = template.Library()
class LoadCss(template.Node):
    def __init__(self, tag_name, css):
        self.css = css
        self.tag_name = tag_name
    def render(self, context):
        request = context['request']
        md5key = hashlib.md5(request.path.encode()).hexdigest()
        if md5key not in request.session:
            request.session[md5key] = list()
        ## This assumes that this method is being called in the correct output order.
        request.session[md5key].append(self.css)
        request.session.modified = True ## In-place changes to a stored list are not detected automatically.
        return '<!-- Require %s -->' % self.css
def do_load_css(parser, token):
    tag_name, css = token.split_contents()
    return LoadCss(tag_name, css)
register.tag('requires', do_load_css)
class IncludeCss(template.Node):
    def __init__(self, tag_name):
        self.tag_name = tag_name
    def render(self, context):
        request = context['request']
        md5key = hashlib.md5(request.path.encode()).hexdigest()
        return '<link rel="stylesheet" href="/path/to/css/view/%s">' % md5key
def do_include_css(parser, token):
    return IncludeCss(token.contents)
register.tag('get_required_css', do_include_css)
views.py:
from django.conf import settings
from django.http import HttpResponse
from django.views.decorators.cache import cache_page
import os
@cache_page(60 * 15) ## 15 Minute cache.
def css_view(request, md5key):
    css_requires = request.session.get(md5key, list())
    output = list()
    for css in css_requires:
        fname = os.path.join(settings.MEDIA_ROOT, 'css', css) ## Assumes MEDIA_ROOT/css/ is where the CSS files are.
        with open(fname, 'r') as f:
            output.append(f.read())
    return HttpResponse(''.join(output), content_type="text/css")
This allows you to store the CSS information in the context, then in the session, and serve the output from a view (with caching to make it faster). This will, of course, have a bit more server overhead.
If you need to vary the CSS on more than just the path, then you can simply modify the md5 lines to suit your needs. You have access to the entire request object, and the context, so almost everything should be in there.
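As a small sketch of what "modify the md5 lines" could look like, the helper below folds extra values into the key alongside the path. css_bundle_key and the "mobile" value are made-up examples; which parts of the request you include is up to you.

```python
import hashlib

def css_bundle_key(path, *extra):
    # Join the parts with a newline, which cannot appear in a URL path,
    # so ("/a", "b") and ("/a\nb",) style collisions are not a concern.
    raw = "\n".join([path, *map(str, extra)])
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

key_plain = css_bundle_key("/blog/")
key_mobile = css_bundle_key("/blog/", "mobile")
```

With no extra values the key is the same as hashing the path alone, so existing keys stay valid; adding a value such as a device type or language code gives that variant its own CSS bundle.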
Beware: On second review, this may cause a race condition if the browser fetches the CSS before the session has been populated. I do not believe Django works that way, but I don't feel like looking it up right now.