Store numpy array in session (Alternatives) - django

I have a class in a view that does some calculations.
The instance of this class is passed to a Django template context and the results are displayed in the html.
On the other hand, I need to store these results somewhere for later use in another view, where a pdf document is generated using these results.
The results are large lists of data.
For example:
def view_one(request):
    # some code
    class_instance = SomeClass(form.cleaned_data)
    html = render_to_string('results.html', {'instance': class_instance})
    return JsonResponse({"result": html})
def view_two(request):
    # some code
    class_pdf = GeneratePdf(class_instance)  # How can I pass the class_instance data?
    # some code
Should I use the Django request session to store the data?
Can I use Celery?
Is there another alternative?

Related

Wagtail add functions to models.py

I'm trying to make a custom Plotly graphic on a Wagtail homepage.
I got this far. I'm overriding the Wagtail Page model by altering the context returned to the template. Am I doing this the right way, and is this possible in models.py?
Thanks in advance.
from django.db import models
from wagtail.models import Page
from wagtail.fields import RichTextField
from wagtail.admin.panels import FieldPanel
import psycopg2
from psycopg2 import sql
import pandas as pd
import plotly.graph_objs as go
from plotly.offline import plot
class CasPage(Page):
    body = RichTextField(blank=True)
    content_panels = Page.content_panels + [
        FieldPanel('body'),
    ]
    def get_connection(self):
        try:
            return psycopg2.connect(
                database="xxxx",
                user="xxxx",
                password="xxxx",
                host="xxxxxxxxxxxxx",
                port=xxxxx,
            )
        except:
            return False
    conn = get_connection()
    cursor = conn.cursor()
    strquery = (f'''SELECT t.datum, t.grwaarde - LAG(t.grwaarde,1) OVER (ORDER BY datum) AS
gebruiktgas
FROM XXX
''')
    data = pd.read_sql(strquery, conn)
    fig1 = go.Figure(
        data = data,
        layout=go.Layout(
            title="Gas-verbruik",
            yaxis_title="aantal M3")
    )
    output = plotly.plot(fig1, output_type='div', include_plotlyjs=False)
    # https://stackoverflow.com/questions/32626815/wagtail-views-extra-context
    def get_context(self, request):
        context = super(CasPage, self).get_context(request)
        context['output'] = output
        return context
Kind of the right track. You should move all the plot code into its own method though. At the moment, it runs the plot code when the site initialises then stays stored in memory.
There are three usual ways to get the plot to the rendered page:
1. As you've done, with context
2. As a property or method of the page class
3. As a template tag called from the template
The first two have more or less the same effect, except the 2nd makes the property available anywhere, not just the template. The context method runs before the page starts rendering, the other two happen during that process. I guess the only real difference there is that if you're using template caching, the context will always run each time the page is loaded, the other two only run when the cache is invalid, or if the code is escaped out of the cache (for fragment caching).
To call the plot as a property of your page class, you'd just pull out the code into a def with the @property decorator:
class CasPage(Page):
    ....
    @property
    def plot(self):
        try:
            conn = psycopg2.connect(
                database="xxxx",
                user="xxxx",
                password="xxxx",
                host="xxxxxxxxxxxxx",
                port=xxxxx,
            )
            cursor = conn.cursor()
            strquery = (f'''SELECT t.datum, t.grwaarde - LAG(t.grwaarde,1) OVER (ORDER BY datum) AS
gebruiktgas FROM XXX''')
            data = pd.read_sql(strquery, conn)
            fig1 = go.Figure(
                data = data,
                layout=go.Layout(
                    title="Gas-verbruik",
                    yaxis_title="aantal M3")
            )
            # plot() here is the module-level function from `from plotly.offline import plot`
            return plot(fig1, output_type='div', include_plotlyjs=False)
        except Exception as e:
            print(f"{type(e).__name__} at line {e.__traceback__.tb_lineno} of {__file__}: {e}")
            return None
^ I haven't tried this code ... it should work as is, but no guarantees I didn't make a typo ;)
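Now you can access your plot with {{ self.plot }} in the template. Since the plot comes back as a plain HTML string, you will probably need {{ self.plot|safe }} so the template doesn't escape it.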
If you want to stick with context, then you'd stay with the def above but just amend your output line to
context['output'] = self.plot
Template tags are more useful when they're being used in StructBlocks and not part of a page class like this, or where you have code that you want to re-use in multiple templates.
Then you'd move all that plot code into a template tag file, register it and call it in the template with {% plot %}. Wagtail template tags work the same as Django: https://docs.djangoproject.com/en/4.1/howto/custom-template-tags/
Is the plot data outside of the site database? If not, you could probably get the data via the ORM if it was defined as a model. If so, it's probably worth writing a view (or stored procedure if you want to pass parameters) on the db server and calling that rather than hard coding the SQL into your python.
The other consideration is the page load time - if the dataset is big, this could take a while and prevent the page from loading. You'd probably want a front-end solution in that case.

Get scrapy result inside a Django view

I'm successfully scraping a page that returns a unique item. I don't want to save the scraped item to the database or to a file. I need to get it inside a Django view.
My view is as follows:
def start_crawl(process_number, court):
    """
    Starts the crawler.
    Args:
        process_number (str): Process number to be found.
        court (str): Court of the process.
    """
    runner = CrawlerRunner(get_project_settings())
    results = list()
    def crawler_results(sender, parse_result, **kwargs):
        results.append(parse_result)
    dispatcher.connect(crawler_results, signal=signals.item_passed)
    process_info = runner.crawl(MySpider, process_number=process_number, court=court)
    return results
I followed this solution, but the results list is always empty.
I read something about creating a custom middleware and getting the results in the process_spider_output method.
How can I get the desired result?
Thanks!
I managed to implement something like that in one of my projects. It is a mini-project and I was looking for a quick solution. You might need to modify it, or add multi-threading support and so on, if you put it in a production environment.
Overview
I created an ItemPipeline that just adds the items to an InMemoryItemStore helper. Then, in my __main__ code, I wait for the crawler to finish and pop all the items out of the InMemoryItemStore. Then I can manipulate the items as I wish.
Code
items_store.py
Hacky in-memory store. It is not very elegant but it got the job done for me. Modify and improve if you wish. I've implemented that as a simple class object so I can simply import it anywhere in the project and use it without passing its instance around.
class InMemoryItemStore(object):
    __ITEM_STORE = None
    @classmethod
    def pop_items(cls):
        items = cls.__ITEM_STORE or []
        cls.__ITEM_STORE = None
        return items
    @classmethod
    def add_item(cls, item):
        if not cls.__ITEM_STORE:
            cls.__ITEM_STORE = []
        cls.__ITEM_STORE.append(item)
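Since the answer mentions that multi-threading support may be needed in production, here is a minimal sketch of a lock-guarded variant of the same idea. The name ThreadSafeItemStore and its attributes are hypothetical, not part of the original answer, and the lock only helps if the store is actually touched from more than one thread in the same process; it does nothing for separate worker processes.

```python
import threading


class ThreadSafeItemStore:
    """Hypothetical lock-guarded variant of the in-memory item store."""

    _items = []
    _lock = threading.Lock()

    @classmethod
    def add_item(cls, item):
        # Appending under the lock keeps concurrent writers from interleaving
        # with a pop_items() call that is swapping the list out.
        with cls._lock:
            cls._items.append(item)

    @classmethod
    def pop_items(cls):
        # Swap the list for a fresh one and hand the old one back.
        with cls._lock:
            items, cls._items = cls._items, []
            return items


ThreadSafeItemStore.add_item({"title": "first"})
ThreadSafeItemStore.add_item({"title": "second"})
popped = ThreadSafeItemStore.pop_items()
leftover = ThreadSafeItemStore.pop_items()  # the store is empty after a pop
```

As with the original class, the store is global state, so it is worth calling pop_items() before each crawl to discard anything left over from a previous run.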
pipelines.py
This pipeline will store the objects in the in-memory store from the snippet above. All items are simply returned to keep the regular pipeline flow intact. If you don't want to pass some items down to the other pipelines, simply change process_item to not return all items.
from <your-project>.items_store import InMemoryItemStore
class StoreInMemoryPipeline(object):
    """Add items to the in-memory item store."""
    def process_item(self, item, spider):
        InMemoryItemStore.add_item(item)
        return item
settings.py
Now add the StoreInMemoryPipeline in the scraper settings. If you change the process_item method above, make sure you set the proper priority here (changing the 100 down here).
ITEM_PIPELINES = {
    ...
    '<your-project-name>.pipelines.StoreInMemoryPipeline': 100,
    ...
}
main.py
This is where I tie all these things together. I clean the in-memory store, run the crawler, and fetch all the items.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from <your-project>.items_store import InMemoryItemStore
from <your-project>.spiders.your_spider import YourSpider
def get_crawler_items(**kwargs):
    InMemoryItemStore.pop_items()
    process = CrawlerProcess(get_project_settings())
    process.crawl(YourSpider, **kwargs)
    process.start()  # the script will block here until the crawling is finished
    process.stop()
    return InMemoryItemStore.pop_items()
if __name__ == "__main__":
    items = get_crawler_items()
If you really want to collect all the data in a "special" object, store the data in a separate pipeline like https://doc.scrapy.org/en/latest/topics/item-pipeline.html#duplicates-filter and, in close_spider (https://doc.scrapy.org/en/latest/topics/item-pipeline.html?highlight=close_spider#close_spider), open your Django object.

Django/Python convert string to Model filter with '=' in result

I'm working on writing test_templates so that I can very quickly write my tests, as I realized I was duplicating the same code with different variables. But I've run into a problem:
# path of view
# '/app/view/path/'
view_name = 'service:create_employee_profile'
# valid field values to test form success.
valid_values = {
    'first_name': 'First',
    'last_name': 'Last',
}
# Search criteria for Model 'get' and 'filter'
# Model.objects.get(field=value)
# Model.objects.get(eval(model_criteria))
model_criteria = 'first_name="First"'
"""
TESTS: Submitting forms
"""
# TEST: View saves valid object.
def test_view_saves_valid_object(self):
    response = self.client.post(
        reverse(view_name), valid_values)
    self.assertTrue(Model.objects.filter(eval(model_criteria)).exists())
I thought I was set with eval(), until I quickly discovered that it doesn't like =. I tried using 2 different variables for 'first_name="First"', but a Model will never find a field out of a variable='field_name'.
These templates help me test multiple views with adding just a little information to them, and since more than 1 test in the template requires retrieving an instance of the model I am trying to set a variable at the top that will run all associated tests.
You can use a dictionary instead:
model_criteria = {'first_name': "First"}
Just unpack it when you pass it as filter() argument using **:
self.assertTrue(Model.objects.filter(**model_criteria).exists())
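To see why the dictionary works where the string did not, here is a small sketch that runs without Django. The function filter_records is a hypothetical stand-in for Model.objects.filter(), and the sample records are made up; the point is only that unpacking a dict with ** is equivalent to writing the keyword arguments out by hand.

```python
def filter_records(records, **criteria):
    """Hypothetical stand-in for Model.objects.filter().

    Keeps the records whose fields match every keyword argument.
    """
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


records = [
    {"first_name": "First", "last_name": "Last"},
    {"first_name": "Other", "last_name": "Last"},
]

model_criteria = {"first_name": "First"}

# These two calls receive exactly the same keyword arguments.
unpacked = filter_records(records, **model_criteria)
spelled_out = filter_records(records, first_name="First")
```

Because the criteria are ordinary data, the same dictionary can be defined once at the top of the test template and reused by every test that needs to look up the instance.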

How to bundle many forms in single instance in django

The main question is whether there is any way to bundle many Django forms into a single instance. To make clear what I need, let me explain my problem.
I have created a bunch of form classes that need to work together to display a single view.
from_form = move_forms.WaypointForm(prefix="marker-from", instance=move.from_place)
to_form = move_forms.WaypointForm(prefix="marker-to", instance=move.to_place)
# Notice that the last two forms are of the same class
through = ThroughFormset(prefix="through", queryset=move.waypoints_db.all())
path_form = move_forms.CarMovePathForm(path=move.path)
date_form = move_forms.MoveForm(instance=move)
#put all this into context instance and render
All these forms basically display/edit the same database instance, but they are reused throughout the app in different configurations (so I can't just manually create a class that encapsulates them).
Having many forms in a webpage is a nuisance. For example, I have to write code like this:
if transportation_form.is_valid() and from_form.is_valid() and \
        to_form.is_valid() and through.is_valid() and path_form.is_valid():
I can't cleanly pass a form property to views, since most views use many forms in this manner.
Is there any sensible way to bundle these forms, or is my design just broken (and if so, how do I fix it)?
What about this:
from_form = move_forms.WaypointForm(
    request.POST or None,
    prefix="marker-from",
    instance=move.from_place)
# ... other forms declared the same way
forms = {
    'from_form': from_form,
    'to_form': to_form,
    # ...
}
if all(f.is_valid() for f in forms.values()):
    # ...
    return redirect('success')
return render(request, 'template.html', {'forms': forms})
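One detail of the all(...) pattern is worth knowing: with a generator expression, all() stops at the first form that reports False, so is_valid() is never called on the forms after it. The sketch below shows the difference using StubForm, a hypothetical stand-in for a Django form that only records whether is_valid() ran. Whether this matters for real forms depends on what else your code expects to have happened during validation.

```python
class StubForm:
    """Hypothetical stand-in for a Django form; records whether is_valid() ran."""

    def __init__(self, valid):
        self._valid = valid
        self.checked = False

    def is_valid(self):
        self.checked = True
        return self._valid


def make_forms():
    # The first form is invalid, the second is valid.
    return {"from_form": StubForm(False), "to_form": StubForm(True)}


# Generator expression: all() short-circuits on the first False.
lazy_forms = make_forms()
lazy_result = all(f.is_valid() for f in lazy_forms.values())
lazy_checked = [f.checked for f in lazy_forms.values()]

# List comprehension: every is_valid() runs before all() looks at the results.
eager_forms = make_forms()
eager_result = all([f.is_valid() for f in eager_forms.values()])
eager_checked = [f.checked for f in eager_forms.values()]
```

Both versions give the same overall answer; the list form simply guarantees that every form has been validated by the time the branch is taken.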

Storing user's avatar upon registration

I have an extended UserProfile for registering new users. My user_created function connects to the signal sent when a basic User instance is registered, and creates a new UserProfile with the extended fields from my form. Here's the code:
from registration.signals import user_registered
from accounts.forms import ExtendedRegistrationForm
import accounts
from accounts.models import UserProfile
def user_created(sender, user, request, **kwargs):
    form = ExtendedRegistrationForm(request.POST, request.FILES)
    data = UserProfile(user=user)
    data.is_active = False
    data.first_name = form.data['first_name']
    data.last_name = form.data['last_name']
    data.pid = form.data['pid']
    data.image = form.data['image']
    data.street = form.data['street']
    data.number = form.data['number']
    data.code = form.data['code']
    data.city = form.data['city']
    data.save()
user_registered.connect(user_created)
The problem is that this form has an image field for the avatar. As you can see from the code, I'm getting the values from the form's data list. But apparently the ImageField does not send its data with the POST request (I'm getting MultiValueDictKeyError at /user/register/, Key 'image' not found in <QueryDict...), so I can't get it from data[].
alt text http://img38.imageshack.us/img38/3839/61289917.png
If the usual variables are inside 'data', where should I look for files? Or is the problem more complicated? The strange thing is that my form doesn't have a cleaned_data attribute... I was using dmitko's method here: http://dmitko.ru/?p=546&lang=en. My code:
forms: http://paste.pocoo.org/show/230754/
models: http://paste.pocoo.org/show/230755/
You should be validating the form before using it, which will create the "cleaned_data" attribute you're used to. Just check form.is_valid() and the "cleaned_data" attribute will be available, and should contain the file.
The form's "data" attribute is going to be whatever you passed in as its first initialization argument (in this case, request.POST), and files are stored separately in the "files" attribute (whatever you pass in as the second argument, in this case, request.FILES). You don't want to be accessing the form's "data" or "files" attributes directly, as, if you do, you're just reading data straight from the request and not getting any benefit from using forms.
Are you sure the <form enctype="..."> attribute is set to multipart/form-data? Otherwise the browser is not able to upload the file data.