In settings.py I have set up logging like this:
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s %(message)s',
    filename='log/sdout.log',
    filemode='a'
)
However, Django sometimes fails silently, for example inside signal handlers. On my work PC everything works, but on my production server something goes wrong. I know where the exception occurs; I just can't see why it fails.
So I placed something like this:
logging.debug('before error')  # this is printed out
try:
    end_date = start_date.date() + relativedelta(months=+month_diff)
    next_billing_date = start_date.date() + datetime.timedelta(7)
except:
    import sys, traceback
    traceback.print_exc(file=sys.stdout)
logging.debug('after error')  # I never get to that part.
EDIT: The above snippet is in my models.py and is called when I receive a POST from PayPal.
After the failure I looked in my Apache error logs and my logging file but saw no errors whatsoever.
How can I see what is wrong with this code?
I would first recommend Sentry. If you can't install it due to system restrictions, it is also available as a hosted service at getsentry.com and, like most such services, has a free tier.
However, the issue could be that mod_wsgi by default blocks writing to stdout, as detailed in the documentation wiki under application issues.
There are a few workarounds posted on that page, but the one I would start with is to add the following to your WSGI configuration:
WSGIRestrictStdout Off
See if that helps with error logging.
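Independently of the stdout restriction, the traceback can be sent to the log file itself rather than to stdout. Here is a minimal sketch of that idea using `logger.exception` from the standard library. The log path, the logger name and the `compute_dates` function are placeholders for illustration, not the code from the question.

```python
import logging
import os
import tempfile

# Hypothetical log location for this sketch; the question uses 'log/sdout.log'.
LOG_PATH = os.path.join(tempfile.gettempdir(), 'sdout_example.log')

logger = logging.getLogger('billing_example')  # hypothetical logger name
logger.setLevel(logging.DEBUG)
handler = logging.FileHandler(LOG_PATH, mode='a')
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(handler)


def compute_dates(start, days):
    """Stand-in for the failing date arithmetic in the question."""
    try:
        return start + days  # raises TypeError for incompatible operands
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback,
        # so the failure lands in the log file instead of stdout.
        logger.exception('date calculation failed')
        raise
```

Re-raising after logging keeps the original behaviour visible to the caller; drop the `raise` if the handler is meant to swallow the error.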
This is a follow-up to this question: "How to stop flask application without using ctrl-c". The problem is that I didn't understand some of the terminology in the accepted answer, since I'm totally new to this.
import dash
import dash_core_components as dcc
import dash_html_components as html

app = dash.Dash()
app.layout = html.Div(children=[
    html.H1(children='Dash Tutorials'),
    dcc.Graph()
])

if __name__ == '__main__':
    app.run_server(debug=True)
How do I shut this down? My end goal is to run a plotly dashboard on a remote machine, but I'm testing it out on my local machine first.
I guess I'm supposed to "expose an endpoint" (have no idea what that means) via:
from flask import request

def shutdown_server():
    func = request.environ.get('werkzeug.server.shutdown')
    if func is None:
        raise RuntimeError('Not running with the Werkzeug Server')
    func()

@app.route('/shutdown', methods=['POST'])
def shutdown():
    shutdown_server()
    return 'Server shutting down...'
Where do I include the above code? Is it supposed to be included in the first block of code that I showed (i.e. the code that contains app.run_server command)? Is it supposed to be separate? And then what are the exact steps I need to take to shut down the server when I want?
Finally, are the steps to shut down the server the same whether I run the server on a local or remote machine?
Would really appreciate help!
The method in the linked answer, werkzeug.server.shutdown, only works with the development server. Creating a view function with an assigned URL ("exposing an endpoint") that calls this shutdown function is just a convenience, and it won't work when the app is deployed with a WSGI server like gunicorn.
Maybe that creates more questions than it answers:
I suggest familiarising yourself with Flask's wsgi-standalone deployment docs.
And then probably the gunicorn deployment guide. The monitoring section has a number of different examples of service monitors, which you can use with gunicorn allowing you to run the app in the background, start on reboot, etc.
Ultimately, starting and stopping the WSGI server is the responsibility of the service monitor and logic to do this probably shouldn't be coded into your app.
What works in both cases of
app.run_server(debug=True)
and
app.run_server(debug=False)
anywhere in the code is:
os.kill(os.getpid(), signal.SIGTERM)
(don't forget to import os and signal)
SIGTERM should cause a clean exit of the application.
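As a minimal sketch, the call can be wrapped in a small helper with its imports. The name `request_shutdown` is made up for this example.

```python
import os
import signal


def request_shutdown(sig=signal.SIGTERM):
    """Ask the current process to terminate by sending it `sig`."""
    os.kill(os.getpid(), sig)
```

If the process has installed its own SIGTERM handler with `signal.signal`, that handler runs instead of the default termination, which makes it a natural place for cleanup.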
First off, this is a follow up question from here: Change number of running spiders scrapyd
I used PhantomJS and Selenium to create a downloader middleware for my Scrapy project. It works well and hasn't really slowed things down when I run my spiders one at a time locally.
But just recently I put a scrapyd server up on AWS. I noticed a possible race condition that seems to be causing errors and performance issues when more than one spider is running at once. I feel like the problem stems from two separate issues.
1) Spiders trying to use phantomjs executable at the same time.
2) Spiders trying to log to phantomjs's ghostdriver log file at the same time.
Guessing here, the performance issue may be the spider trying to wait until the resources are available (this could be due to the fact that I also had a race condition for an sqlite database as well).
Here are the errors I get:
exceptions.IOError: [Errno 13] Permission denied: 'ghostdriver.log' (log file race condition?)
selenium.common.exceptions.WebDriverException: Message: 'Can not connect to GhostDriver' (executable race condition?)
My questions are:
Does my analysis of the problem(s) seem correct?
Are there any known solutions to this problem other than limiting the number of spiders that can be run at a time?
Is there some other way I should be handling javascript? (if you think I should create an entirely new question to discuss the best way to handle javascript with scrapy let me know and I will)
Here is my downloader middleware:
class JsDownload(object):

    @check_spider_middleware
    def process_request(self, request, spider):
        if _platform == "linux" or _platform == "linux2":
            driver = webdriver.PhantomJS(service_log_path='/var/log/scrapyd/ghost.log')
        else:
            driver = webdriver.PhantomJS(executable_path=settings.PHANTOM_JS_PATH)
        driver.get(request.url)
        return HtmlResponse(request.url, encoding='utf-8', body=driver.page_source.encode('utf-8'))
Note: the _platform code is a temporary workaround until I get this source code deployed into a static environment.
I found solutions on SO for the JavaScript problem, but they were spider based. This bothered me because it meant every request had to be made once in the downloader handler and again in the spider. That is why I decided to implement mine as a downloader middleware.
Try using WebDriver to interface with PhantomJS:
https://github.com/brandicted/scrapy-webdriver
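On the log-file error specifically: if the shared ghostdriver log is indeed what the spiders collide on (an assumption, as the question itself notes), one possible mitigation is to give every driver instance its own log path. A minimal sketch follows; the helper name `unique_log_path` is hypothetical, and only the `service_log_path` argument comes from the question's code.

```python
import os
import uuid


def unique_log_path(log_dir, prefix='ghostdriver'):
    """Build a log path that no other process or driver instance will share.

    The file name combines the process id with a random suffix, so two
    spiders started at the same moment never write to the same file.
    """
    name = '{}-{}-{}.log'.format(prefix, os.getpid(), uuid.uuid4().hex[:8])
    return os.path.join(log_dir, name)


# Intended use inside the middleware from the question (not run here):
#   driver = webdriver.PhantomJS(service_log_path=unique_log_path('/var/log/scrapyd'))
```

This only addresses the log file; it does not change how many PhantomJS processes are started.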
I've noticed that on occasions where I've run my django project without the PostgreSQL server available, the errors produced are fairly cryptic and often appear to be generated by deep django internals as these are the functions actually connecting to the backend.
Is there a simple, clean (and DRY) way to test that the server is running?
Where is the best place to put project level start up checks?
You can register a handler for the class_prepared signal.
https://docs.djangoproject.com/en/dev/ref/signals/#class-prepared
Then try executing custom SQL directly.
https://docs.djangoproject.com/en/dev/topics/db/sql/#executing-custom-sql-directly
If it fails, raise your custom exception.
import time
from django.db import connections
from django.db.utils import OperationalError

# Runs inside a management command's handle() method (hence self.stdout).
self.stdout.write('Waiting for database...')
db_up = False
while not db_up:
    try:
        # connections['default'] is lazy, so force a real connection attempt
        connections['default'].ensure_connection()
        db_up = True
    except OperationalError:
        self.stdout.write('Database unavailable, waiting 1 second...')
        time.sleep(1)
self.stdout.write(self.style.SUCCESS('Database available!'))
You can use this snippet wherever you need it.
Before accessing the database and making any queries, check whether the database is up.
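A wait loop like this never gives up, which may not be what you want in a startup check. Below is a framework-independent sketch of a bounded retry. The `wait_for` helper and its parameters are made up for this example; `check` would be whatever function attempts the database connection.

```python
import time


def wait_for(check, attempts=5, delay=1.0, sleep=time.sleep):
    """Call `check` until it succeeds or `attempts` runs out.

    `check` is any zero-argument callable that raises on failure, for
    example a function that opens a database connection. Returns the
    number of attempts used; re-raises the last error if all attempts fail.
    """
    for attempt in range(1, attempts + 1):
        try:
            check()
            return attempt
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
```

Passing `sleep` as a parameter is only there so the helper can be exercised without real waiting.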
I'm running some basic test code with web.py and GAE (Windows 7, Python 2.7). The form enables messages to be posted to the datastore. When I stop the app and run it again, any data posted previously has disappeared. Adding entities manually using the admin (http://localhost:8080/_ah/admin/datastore) has the same problem.
I tried setting the path in the Application Settings using Extra flags:
--datastore_path=D:/path/to/app/
(Wasn't sure about syntax there). It had no effect. I searched my computer for *.datastore, and couldn't find any files, either, which seems suspect, although the data is obviously being stored somewhere for the duration of the app running.
from google.appengine.ext import db
import web

urls = (
    '/', 'index',
    '/note', 'note',
    '/crash', 'crash'
)

render = web.template.render('templates/')

class Note(db.Model):
    content = db.StringProperty(multiline=True)
    date = db.DateTimeProperty(auto_now_add=True)

class index:
    def GET(self):
        notes = db.GqlQuery("SELECT * FROM Note ORDER BY date DESC LIMIT 10")
        return render.index(notes)

class note:
    def POST(self):
        i = web.input('content')
        note = Note()
        note.content = i.content
        note.put()
        return web.seeother('/')

class crash:
    def GET(self):
        import logging
        logging.error('test')
        crash

app = web.application(urls, globals())

def main():
    app.cgirun()

if __name__ == '__main__':
    main()
UPDATE:
When I run it via command line, I get the following:
WARNING 2012-04-06 19:07:31,266 rdbms_mysqldb.py:74] The rdbms API is not available because the MySQLdb library could not be loaded.
INFO 2012-04-06 19:07:31,778 appengine_rpc.py:160] Server: appengine.google.com
WARNING 2012-04-06 19:07:31,783 datastore_file_stub.py:513] Could not read datastore data from c:\users\amy\appdata\local\temp\dev_appserver.datastore
WARNING 2012-04-06 19:07:31,851 dev_appserver.py:3394] Could not initialize images API; you are likely missing the Python "PIL" module. ImportError: No module named _imaging
INFO 2012-04-06 19:07:32,052 dev_appserver_multiprocess.py:647] Running application dev~palimpsest01 on port 8080: http://localhost:8080
INFO 2012-04-06 19:07:32,052 dev_appserver_multiprocess.py:649] Admin console is available at: http://localhost:8080/_ah/admin
Suggesting that the datastore... didn't install properly?
As of 1.6.4, we stopped saving the datastore after every write. This method did not work when simulating the transactional model found in the High Replication Datastore (you would lose the last couple writes). It is also horribly inefficient. We changed it so the datastore dev stub flushes all writes and saves its state on shut down. It sounds like the dev_appserver is not shutting down correctly. You should see:
Applying all pending transactions and saving the datastore
in the logs when shutting down the server (see the source code). If you don't, it means that the dev_appserver is not being shut down cleanly (with a TERM signal or KeyboardInterrupt).
I'm trying to use Redis to lock some of the big PostgreSQL management transactions in my project.
I haven't been successful so far in my development environment.
A simplified version of the code looks like this:
def test_view(request):
    connec = redis.Redis(unix_socket_path='/tmp/vgbet_redis.sock')
    if not connec.setnx('test', ''):
        print 'Locked'
    else:
        time.sleep(5)  # Slow transaction
        connec.delete('test')
        print 'Unlocked'
    return render_to_response("test.html")
If I open two tabs of that view, the first one prints Unlocked after 5 seconds, then the second one prints Unlocked after 10 seconds. It looks like they are handled synchronously, which doesn't make any sense to me.
Edit:
I have tried running it under Apache and under gevent, and I got the exact same results.
So I guess there is really something I don't understand with django + redis and my code is really wrong.
Any help would be great.
Edit2:
I just tried with django-redis by using redis as a cache.
CACHES = {
    'default': {
        'BACKEND': 'redis_cache.RedisCache',
        'LOCATION': '/tmp/vgbet_redis.sock',
        'OPTIONS': {
            'DB': 1,
            'PASSWORD': None,
            'PARSER_CLASS': 'redis.connection.HiredisParser'
        },
    },
}
And I still have the same result if I open two tabs in my browser. The second view is blocked for 5 seconds, as if everything is synchronous.
from django.core.cache import cache

def test_view(request):
    if cache.get('test') is not None:
        print 'Locked'
    else:
        cache.set('test', '', 60)
        time.sleep(5)  # Slow transaction
        cache.delete('test')
    return render_to_response("test.html")
If I open two terminals, I have no issue reading and writing in redis. So I really don't understand why I'm not able to use the cache in views.
A couple things to check:
The default Django development server is single-threaded, so it can only handle one request at a time. The simplest way to test this would be to run the development server twice on different ports (e.g. ./manage.py runserver 8080 and ./manage.py runserver 8081).
If you are using an SQL database at all, it might be blocking on a transaction. Are these the exact views you are using? Or are you doing anything with models?
You mentioned using gevent. Did you make sure to call from gevent import monkey; monkey.patch_all() to monkey patch everything? Also, how are you running your server with gevent?
The main reason for my issue was that I was using two tabs in the same browser. If I use two browsers or two different IPs, my code works asynchronously (with gevent and Apache, not with runserver, but this isn't a surprise).
I think there is a rule along the lines of: if a single session requests the same view multiple times, the requests are served one after another. I don't know whether that is due to the server or to Django. I can't find anything like that in the documentation. If anyone knows, I would be really interested to understand that last part.
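Separately from the tab behaviour, the cache-based version in the question checks the key and then sets it in two steps, so two requests arriving together could both pass the check. Here is a sketch of an atomic variant, assuming a redis-py client whose `set` accepts `nx` and `ex`. The `redis_lock` helper is a made-up name for this example.

```python
import contextlib


@contextlib.contextmanager
def redis_lock(client, name, ttl_seconds=60):
    """Try to take a lock and yield True if acquired, False otherwise.

    `client` is assumed to be a redis-py client. SET with nx=True only
    writes the key if it does not exist yet, and ex= makes it expire so a
    crashed worker cannot hold the lock forever.
    """
    acquired = bool(client.set(name, '1', nx=True, ex=ttl_seconds))
    try:
        yield acquired
    finally:
        if acquired:
            client.delete(name)  # release only a lock we actually hold
```

In a view this would read `with redis_lock(connec, 'test') as got_it:` followed by the slow transaction when `got_it` is true. Note that if the work outlasts `ttl_seconds`, the key expires and the final delete could remove a lock taken by someone else, so the TTL should comfortably exceed the longest transaction.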