I have the following management command (website.py)
from __future__ import absolute_import
from django.core.management.base import BaseCommand
class Command(BaseCommand):
def run_from_argv(self, argv):
self._argv = argv
self.execute()
def handle(self, *args, **options):
from scrapy.cmdline import execute
execute(self._argv[1:])
I'd like to execute this command via URL: /crawl/update-now/
The view is:
from django.core import management
def update_index(request):
management.call_command('website', 'crawl spider')
But it's not working:
Command' object has no attribute '_argv'
I think problem is that run_from_argv is internal Django method and called by django.core.management.ManagementUtility. And you should not implement it by yourself, self._argv is not set anywhere. Arguments are already available in handle().
And your approach has some disadvantages.
First of all, due to synchronous nature of Django, if your URL is "heavy" it can take a lot of time to get it and parse. Instead of it, I'm strongly recommend you to take a look at Celery. It is more "right" way to execute tasks from views and has no problems with performance.
Related
I want to use scrapy spider in Django views and I tried using CrawlRunner and CrawlProcess but there are problems, views are synced and further crawler does not return a response directly
I tried a few ways:
# Core imports.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
# Third-party imports.
from rest_framework.views import APIView
from rest_framework.response import Response
# Local imports.
from scrapy_project.spiders.google import GoogleSpider
class ForFunAPIView(APIView):
def get(self, *args, **kwargs):
process = CrawlerProcess(get_project_settings())
process.crawl(GoogleSpider)
process.start()
return Response('ok')
is there any solution to handle that and run spider directly in other scripts or projects without using DjangoItem pipeline?
you didn't really specify what the problems are, however, I guess the problem is that you need to return the Response immediately, and leave the heavy call aka function to run in the background, you can alter your code as following, to use the Threading module
from threading import Thread
class ForFunAPIView(APIView):
def get(self, *args, **kwargs):
process = CrawlerProcess(get_project_settings())
process.crawl(GoogleSpider)
thread = Thread(target=process.start)
thread.start()
return Response('ok')
after a while of searching for this topic, I found a good explanation here: Building a RESTful Flask API for Scrapy
In my project, I have a main static folder and a sub folder named static. When I make changes in my sub folder named static (which I specified in COLLECTSTATIC_DIRS within the settings file), I save the file and run the collectstatic command.
This successfully saves the changes, however is really inefficient as I am constantly changing css and Javascript files inside my project, which I store as static files.
I browsed the web, and came across a solution named whitenoise, which is a pip package. But this package only works for a short period of time, and after a few times of closing and opening my project folder, it completely stopped working.
Does anybody have another solution to deal with this problem? Thank you.
You can use python-watchdog and write your own Django command:
import time
from django.conf import settings
from django.core.management import call_command
from django.core.management.base import BaseCommand
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer
class Command(BaseCommand):
help = "Automatically calls collectstatic when the staticfiles get modified."
def handle(self, *args, **options):
event_handler = CollectstaticEventHandler()
observer = Observer()
for path in settings.STATICFILES_DIRS:
observer.schedule(event_handler, path, recursive=True)
observer.start()
try:
while True:
time.sleep(1)
finally:
observer.stop()
observer.join()
class CollectstaticEventHandler(FileSystemEventHandler):
def on_moved(self, event):
super().on_moved(event)
self._collectstatic()
def on_created(self, event):
super().on_created(event)
self._collectstatic()
def on_deleted(self, event):
super().on_deleted(event)
self._collectstatic()
def on_modified(self, event):
super().on_modified(event)
self._collectstatic()
def _collectstatic(self):
call_command("collectstatic", interactive=False)
You can use 3rd party solution which doesn't belong to Django to monitor your files and run commands on the files changes.
Take a look at fswatch utility Bash Script - fswatch trigger bash function
How to manage static files "Django-way"
Please check details on https://docs.djangoproject.com/en/3.0/howto/static-files/ what is a Django-way to do it correct :)
First of all set DEBUG = True while working on development.
Then add these lines to your project's urls.py:
from django.conf import settings
from django.views.decorators.cache import cache_control
from django.contrib.staticfiles.views import serve
from django.conf.urls.static import static
if settings.DEBUG:
urlpatterns += static(settings.STATIC_URL,
view=cache_control(no_cache=True, must_revalidate=True)(serve))
Here's an example using Python watchfiles and a Django management command.
# Lives in /my_app/manangement/commands/watch_files.py
import time
from django.conf import settings
from django.core.management import call_command
from django.core.management.base import BaseCommand
from watchfiles import watch
class Command(BaseCommand):
help = "Automatically calls collectstatic when the staticfiles get modified."
def handle(self, *args, **options):
print('WATCH_STATIC: Static file watchdog started.')
#for changes in watch([str(x) for x in settings.STATICFILES_DIRS]):
for changes in watch(*settings.STATICFILES_DIRS):
print(f'WATCH_STATIC: {changes}', end='')
call_command("collectstatic", interactive=False)
You can then run the Django management command in the background in whatever script you use to start Django.
python manage.py watch_static &
python manage.py runserver 0.0.0.0:8000
I'm aware of pdb built-in Python library for debugging Python programs and scripts. However, you can't really use it for debugging Django apps (I get errors when I try to use it in views.py). Are there any tools that I can use when Django's traceback isn't helpful ?
EDIT:
from .forms import TestCaseForm, TestCaseSuiteForm
from .models import TestCase, TestSuite
from django.contrib.auth.forms import UserCreationForm
from django.views.generic import FormView, ListView
from django.contrib.auth.models import User
from django.utils.decorators import method_decorator
from django.contrib.auth.decorators import login_required
from django.contrib.auth import logout
import pdb
class ListAllTestSuites(ListView):
template_name = 'list.html'
context_object_name = 'username'
def get_queryset(self):
pdb.set_trace() # <-- setting a trace here to diagnose the code below
queryset = {'test_suites': TestSuite.objects.filter(user=self.request.user),
'username': self.request.user}
return queryset
you forgot the exact error message and full traceback, and, more importantly, you forgot to explain how you executed this code to get this result...
But anyway: from the error message, you're obviously trying to execute your view file as a plain python script (cf the reference to __main__). This cannot work. A view is a module, not a script, and, moreover, any module dependending on Django needs some setup done to be imported (which is why we use the django shell - ./manage.py shell - instead of the regular Python one).
For most case, you can just launch the django shell, import your module and use pdb.runcall() to execute some function / method under the debugger (no need to put a breakpoint then, but that's still possible).
Now views require a HTTPRequest as first argument which make them a bit more cumbersome to debug that way (well, there is django.tests.RequestFactory but still...), so your best bet here is to set your breakpoint, launch your devserver (or restart it - if it didn't already did, as it should), point your browser to the view's url, and then you should see the debugger's prompt in your devserver's terminal.
I'll keep this as vague as possible - as it's quite a broad question.
I'm building a payment system within my Django project - and it would be amazing to be able to run my project over a secure server connections. And now were moving into a more forced secure internet with more emphasis on site security with in browser alerts etc I think this is something that needs to be added to the Django core management commands.
I've started to build this functionality inside an application:
management/commands/runsecureserver.py:
import os
import ssl
import sys
from django.core.servers.basehttp import WSGIServer
class SecureHTTPServer(WSGIServer):
def __init__(self, address, handler_cls, certificate, key):
super(SecureHTTPServer, self).__init__(address, handler_cls)
self.socket = ssl.wrap_socket(self.socket, certfile=certificate,
keyfile=key, server_side=True,
ssl_version=ssl.PROTOCOL_TLSv1_2,
cert_reqs=ssl.CERT_NONE)
I'm now wondering - I have the following class that extends from the BaseCommand class from Django's runserver.py where I add my arguments for specifying cert files etc, an inner_run() function which will mimic a lot of the Django runserver inner_run() with added certificate checks, port configurations etc.
class Command(BaseCommand):
def add_arguments(self, parser):
super(Command, self).add_arguments(parser)
parser.add_argument(**some argument here***)
However, when running:
$ python manage.py runsecureserver
I receive the following error:
NotImplementedError: subclasses of BaseCommand must provide a handle() method
So, it's telling me I need a handle() method...
Q. What is a handle() method in this context, and what should it do?
Q. Is it enough to simply use the existing handle() method from Django's runserver.py?
https://docs.djangoproject.com/en/3.2/howto/custom-management-commands/
Your file must be within /DjangoProject/YouApp/management/commands/
# custom_command.py
class Command(BaseCommand):
def add_arguments(self, parser):
parser.add_argument('--delta')
parser.add_argument('--alpha')
parser.add_argument('--universe')
def handle(self, *args, **options):
# argument in a dict here :
print(options)
# Do your stuff :)
pass
python manage.py custom_command --universe=42
I am developing my first django website.
I have written code in my view layer (the handlers that return an HttpResponse object to the view template (hope I am using the correct terminology).
In any case, I want to put print statements in my views.py file, so that I can debug it. However, it looks like stdout has been redirect to another stream, so I am not seeing anything printed out on my console (or even the browser).
What is the recommended way (best practice) for debugging django view layer scripts?
there are more advanced ways of doing it, but i find dropping
import pdb
pdb.set_trace()
does the job.
Use the Python logging module. Then use the Django debug toolbar, which will catch and display all the things you send to the log.
I'd upvote dysmsyd, but I don't have the reputation.
pdb is good because it lets you step thru your procedure and follow the control flow.
If you are using the django runserver, you can print to stdout or stderr.
If you are using the mod_wsgi, you can print to stderr.
The pprint module is also useful:
import sys
from pprint import pprint
def myview(request):
pprint (request, sys.stderr)
Try django-sentry. Especially if your project is in production stage.
Configure django debug toolbar: pip install django-debug-toolbar and follow the instructions to configure it in: https://github.com/django-debug-toolbar/django-debug-toolbar
import logging
Use the logging to debug: logging.debug('My DEBUG message')
Here is how it works on my class view:
from django.views.generic import TemplateView
import logging
class ProfileView(TemplateView):
template_name = 'profile.html'
def get(self, request, *args, **kwargs):
logging.debug(kwargs)
return render(request, self.template_name)