I'm trying to add caching to my Python Flask application.
I did what the Flask-Caching docs suggest, so I have this in my app module:
config = {
    # "DEBUG": True,
    "env": 'dev',
    "secret_key": 'my secret stuff',
    "CACHE_TYPE": "simple",
    "CACHE_DEFAULT_TIMEOUT": 300
}

app = Flask(__name__)
app.config.from_mapping(config)
cors = CORS(app, resources={"/api/*": {"origins": "*"}})
cache = Cache(app)
cache.init_app(app)
@app.before_first_request
def init_resources():
    Database.initialize()

app.register_blueprint(auth_api_blueprint, url_prefix="/api/auth")
app.register_blueprint(user_api_blueprint, url_prefix="/api/user")
app.register_blueprint(year_api_blueprint, url_prefix="/api/year")
app.register_blueprint(notice_api_blueprint, url_prefix="/api/notice")
app.register_blueprint(event_api_blueprint, url_prefix="/api/event")
app.register_blueprint(admins_api_blueprint, url_prefix="/api/admin")
app.register_blueprint(guardian_api_blueprint, url_prefix="/api/guardian")
app.register_blueprint(employee_api_blueprint, url_prefix="/api/employee")
app.register_blueprint(student_api_blueprint, url_prefix="/api/student")
app.register_blueprint(teacher_api_blueprint, url_prefix="/api/teacher")

if __name__ == '__main__':
    with app.app_context():
        cache.clear()
    # app.run(port=8080) - port does not work here, it is still default 5000
    app.run()
Then I applied the cached decorator to the method like this:
from common.database import Database
from common.decorators import requires_login

year_api_blueprint = Blueprint('api/year', __name__)

from src.app import cache


@year_api_blueprint.route('/all')
@cache.cached(timeout=500, key_prefix="years")
# @requires_login - this needs to be public
def get_all_years():
    data = Database.find("years", {})
    if data is not None:
        return jsonify([year for year in data])
All seems to work fine, and the method above is no longer called many times (just once).
However, I am getting this error every time the cached years are used:
127.0.0.1 - - [20/Oct/2020 17:36:32] "OPTIONS /api/year/all HTTP/1.1" 200 -
Exception possibly due to cache backend.
Traceback (most recent call last):
File "/home/smoczyna/Python-Projects/SchoolMateAPI/venv/lib/python3.6/site-packages/flask_caching/__init__.py", line 435, in decorated_function
rv = self.cache.get(cache_key)
File "/home/smoczyna/Python-Projects/SchoolMateAPI/venv/lib/python3.6/site-packages/flask_caching/__init__.py", line 244, in cache
return app.extensions["cache"][self]
KeyError: <flask_caching.Cache object at 0x7ff585e47358>
I've seen a similar post here, but I don't understand the solution and don't know how to apply it. I have nothing more than this about the cache in my app.
So my question is: what is possibly missing or misconfigured here?
I am developing a spider project and I have moved to a new computer. Now I am installing everything and I have run into a problem with Twisted. I have read about this bug and I have installed pywin32 and then also WinPython, but it doesn't help. I have tried to update Twisted with this command
pip install Twisted --update
as advised in the forum, but it says that pip install doesn't have an --update option. I have also run
python python27\scripts\pywin32_postinstall.py -install
but with no success. This is my error:
G:\Job_vacancies\Python\vacancies>scrapy crawl jobs
2015-10-06 09:12:53 [scrapy] INFO: Scrapy 1.0.3 started (bot: vacancies)
2015-10-06 09:12:53 [scrapy] INFO: Optional features available: ssl, http11
2015-10-06 09:12:53 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'vacancies.spiders', 'SPIDER_MODULES': ['vacancies.spiders'], 'DEPTH_LIMIT': 3, 'BOT_NAME': 'vacancies'}
2015-10-06 09:12:53 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
Unhandled error in Deferred:
2015-10-06 09:12:53 [twisted] CRITICAL: Unhandled error in Deferred:
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\scrapy\cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "c:\python27\lib\site-packages\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "c:\python27\lib\site-packages\scrapy\crawler.py", line 153, in crawl
    d = crawler.crawl(*args, **kwargs)
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1274, in unwindGenerator
    return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
  File "c:\python27\lib\site-packages\twisted\internet\defer.py", line 1128, in _inlineCallbacks
    result = g.send(result)
  File "c:\python27\lib\site-packages\scrapy\crawler.py", line 71, in crawl
    self.engine = self._create_engine()
  File "c:\python27\lib\site-packages\scrapy\crawler.py", line 83, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "c:\python27\lib\site-packages\scrapy\core\engine.py", line 66, in __init__
    self.downloader = downloader_cls(crawler)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\__init__.py", line 65, in __init__
    self.handlers = DownloadHandlers(crawler)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 23, in __init__
    cls = load_object(clspath)
  File "c:\python27\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "c:\python27\lib\site-packages\scrapy\core\downloader\handlers\s3.py", line 6, in <module>
    from .http import HTTPDownloadHandler
  File "c:\python27\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 5, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python27\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 15, in <module>
    from scrapy.xlib.tx import Agent, ProxyAgent, ResponseDone, \
  File "c:\python27\lib\site-packages\scrapy\xlib\tx\__init__.py", line 3, in <module>
    from twisted.web import client
  File "c:\python27\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import TCP4ClientEndpoint, SSL4ClientEndpoint
  File "c:\python27\lib\site-packages\twisted\internet\endpoints.py", line 34, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python27\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python27\lib\site-packages\twisted\internet\_win32stdio.py", line 7, in <module>
    import win32api
exceptions.ImportError: DLL load failed: The specified module could not be found.
2015-10-06 09:12:53 [twisted] CRITICAL:
And this is my code:
#!/usr/bin/python
# -*- coding: utf-8 -*-
# encoding=UTF-8

import scrapy, urlparse
from scrapy.http import Request
from scrapy.utils.response import get_base_url
from urlparse import urlparse, urljoin
from vacancies.items import JobItem

# We need this in order to force Slovenian pages instead of English pages. It happened at
# "http://www.g-gmi.si/gmiweb/" that only English pages were found and no Slovenian ones.
#from scrapy.conf import settings
#settings.overrides['DEFAULT_REQUEST_HEADERS'] = {'Accept':'text/html,application/xhtml+xml;q=0.9,*/*;q=0.8','Accept-Language':'sl',}
#settings.overrides['DEFAULT_REQUEST_HEADERS'] = {'Accept':'text/html,application/xhtml+xml;q=0.9,*/*;q=0.8','Accept-Language':'sl','en':q=0.8,}


class JobSpider(scrapy.Spider):
    name = "jobs"

    # Test sample of SLO companies.
    start_urls = [
        "http://www.g-gmi.si/gmiweb/",
    ]

    # Result of the programme is this list of job vacancy webpages.
    jobs_urls = []

    def parse(self, response):
        response.selector.remove_namespaces()

        # We take all urls, they are marked by "href". These are either webpages on our website or new websites.
        urls = response.xpath('//@href').extract()

        # Base url.
        base_url = get_base_url(response)

        # Loop through all urls on the webpage.
        for url in urls:

            # If url represents a picture, a document, a compression ... we ignore it. We might have to
            # change that because some companies provide job vacancy information in PDF.
            if url.endswith((
                # images
                '.jpg', '.jpeg', '.png', '.gif', '.eps', '.ico', '.svg', '.tif', '.tiff',
                '.JPG', '.JPEG', '.PNG', '.GIF', '.EPS', '.ICO', '.SVG', '.TIF', '.TIFF',
                # documents
                '.xls', '.ppt', '.doc', '.xlsx', '.pptx', '.docx', '.txt', '.csv', '.pdf', '.pd',
                '.XLS', '.PPT', '.DOC', '.XLSX', '.PPTX', '.DOCX', '.TXT', '.CSV', '.PDF', '.PD',
                # music and video
                '.mp3', '.mp4', '.mpg', '.ai', '.avi', '.swf',
                '.MP3', '.MP4', '.MPG', '.AI', '.AVI', '.SWF',
                # compressions and other
                '.zip', '.rar', '.css', '.flv', '.php',
                '.ZIP', '.RAR', '.CSS', '.FLV', '.PHP',
            )):
                continue

            # If url includes characters like ?, %, &, # ... it is LIKELY NOT to be the one we are looking for
            # and we ignore it. However in this case we also exclude good urls like http://www.mdm.si/company#employment
            if any(x in url for x in ['?', '%', '&', '#']):
                continue

            # Ignore ftp.
            if url.startswith("ftp"):
                continue

            # We need to save the original url for xpath, in case we change it later (join it with base_url).
            url_xpath = url

            # If url doesn't start with "http", it is a relative url, and we add the base url to get an absolute url.
            # -- It is true that we may get some strange urls, but it is fine for now.
            if not url.startswith("http"):
                url = urljoin(base_url, url)

            # We don't want to go to other websites. We want to stay on our website, so we keep only urls with
            # the domain (netloc) of the company we are investigating.
            if urlparse(url).netloc == urlparse(base_url).netloc:

                # The main part. We look for webpages whose urls include one of the employment words as strings.
                # -- Instruction.
                # -- Users in other languages, please insert employment words in your own language, like jobs, vacancies, career, employment ... --
                if any(x in url for x in [
                    'zaposlovanje',
                    'Zaposlovanje',
                    'zaposlitev',
                    'Zaposlitev',
                    'zaposlitve',
                    'Zaposlitve',
                    'zaposlimo',
                    'Zaposlimo',
                    'kariera',
                    'Kariera',
                    'delovna-mesta',
                    'delovna_mesta',
                    'pridruzi-se',
                    'pridruzi_se',
                    'prijava-za-delo',
                    'prijava_za_delo',
                    'oglas',
                    'Oglas',
                    'iscemo',
                    'Iscemo',
                    'careers',
                    'Careers',
                    'jobs',
                    'Jobs',
                    'employment',
                    'Employment',
                ]):
                    # This is an additional filter, suggested by Dan Wu, to improve accuracy. We check the text of the url as well.
                    texts = response.xpath('//a[@href="%s"]/text()' % url_xpath).extract()

                    # 1. Texts are empty.
                    if texts == []:
                        print "Ni teksta za url: " + str(url)

                        # We found a url that includes one of the magic words.
                        # We check the url; if we have not found it before, we add it to the list "jobs_urls".
                        if url not in self.jobs_urls:
                            self.jobs_urls.append(url)
                            item = JobItem()
                            #item["text"] = text
                            item["url"] = url
                            # We return the item.
                            yield item

                    # 2. There are texts, one or more.
                    else:
                        # For the same partial url several texts are possible.
                        for text in texts:
                            if any(x in text for x in [
                                'zaposlovanje',
                                'Zaposlovanje',
                                'zaposlitev',
                                'Zaposlitev',
                                'zaposlitve',
                                'Zaposlitve',
                                'zaposlimo',
                                'Zaposlimo',
                                'ZAPOSLIMO',
                                'kariera',
                                'Kariera',
                                'delovna-mesta',
                                'delovna_mesta',
                                'pridruzi-se',
                                'pridruzi_se',
                                'oglas',
                                'Oglas',
                                'iscemo',
                                'Iscemo',
                                'ISCEMO',
                                'careers',
                                'Careers',
                                'jobs',
                                'Jobs',
                                'employment',
                                'Employment',
                            ]):
                                # We found a url that includes one of the magic words and the text also includes a magic word.
                                # We check the url; if we have not found it before, we add it to the list "jobs_urls".
                                if url not in self.jobs_urls:
                                    self.jobs_urls.append(url)
                                    item = JobItem()
                                    item["text"] = text
                                    item["url"] = url
                                    # We return the item.
                                    yield item

                # We don't put an "else" branch because we want to further explore the employment webpage to find possible new employment webpages.
                # We keep looking for employment webpages until we reach the DEPTH that we have set in settings.py.
                yield Request(url, callback=self.parse)

# We run the programme on the command line with this command:
#   scrapy crawl jobs -o jobs.csv -t csv --logfile log.txt
# We get two output files:
#   1) jobs.csv
#   2) log.txt
# Then we manually put one of the employment urls from jobs.csv into read.py
I would be glad if you could give some advice on how to run this thing. Thank you, Marko
You should always install stuff into a virtualenv. Once you've got a virtualenv and it's active, do:
pip install --upgrade twisted pypiwin32
and you should get the dependency that makes Twisted support stdio on the Windows platform.
To get all the goodies you might try
pip install --upgrade twisted[windows_platform]
but you may run into problems with gmp.h if you try that, and you don't need most of it to do what you're trying to do.
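Once that has installed inside the active virtualenv, a quick sanity check (just a sketch, not part of the original setup) is to import the modules that failed in the traceback:

import win32api                     # this is the import that raised "DLL load failed"
from twisted.internet import stdio  # pulls in _win32stdio on Windows
print "pywin32 and Twisted stdio support look OK"

If both imports succeed in the same environment that runs scrapy, the DLL problem should be gone.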
I have five different Django projects all running on one box with one installation of RabbitMQ. I use celery for various tasks. Each project appears to be receiving tasks meant for other projects.
Each codebase has its own virtual environment where something like the following is run:
./manage.py celeryd --concurrency=2 --queues=high_priority
The parameters in each settings.py look like the following:
CELERY_SEND_EVENTS = True
CELERY_TASK_RESULT_EXPIRES = 10
CELERY_RESULT_BACKEND = 'amqp'
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
CELERY_TIMEZONE = 'UTC'
BROKER_URL = 'amqp://guest@127.0.0.1:5672//'
BROKER_VHOST = 'specific_app_name'
I'm seeing tracebacks that make me think apps are receiving each other's messages when they shouldn't be:
Traceback (most recent call last):
File "/home/.../.virtualenvs/.../local/lib/python2.7/site-packages/kombu/messaging.py", line 556, in _receive_callback
decoded = None if on_m else message.decode()
File "/home/.../.virtualenvs/.../local/lib/python2.7/site-packages/kombu/transport/base.py", line 147, in decode
self.content_encoding, accept=self.accept)
File "/home/.../.virtualenvs/.../local/lib/python2.7/site-packages/kombu/serialization.py", line 187, in decode
return decode(data)
File "/home/.../.virtualenvs/.../local/lib/python2.7/site-packages/kombu/serialization.py", line 74, in pickle_loads
return load(BytesIO(s))
ImportError: No module named emails.models
The emails.models module in this case appears in one project but not the others. Yet the others are showing this traceback.
I haven't looked at multiple node names or anything like that. Would something like that fix this problem?
Your AMQP settings in celeryconfig.py are wrong. You are using:
BROKER_URL = 'amqp://guest@127.0.0.1:5672//'
BROKER_VHOST = 'specific_app_name'
The BROKER_VHOST parameter is ignored because BROKER_URL is present (it is also deprecated). If you want to use virtual hosts (which, by the way, is the preferred way to solve the problem you presented), you should create a virtual host for each app and use the following in each app's settings:
BROKER_URL = 'amqp://guest@127.0.0.1:5672//specific_app_name'
edited: fixed missing /
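For illustration, a minimal sketch of that setup (the vhost/project names below are placeholders, and the rabbitmqctl commands assume the default guest user):

# Create one vhost per project and grant the broker user access to it, e.g.:
#   rabbitmqctl add_vhost project1
#   rabbitmqctl set_permissions -p project1 guest ".*" ".*" ".*"
#   rabbitmqctl add_vhost project2
#   rabbitmqctl set_permissions -p project2 guest ".*" ".*" ".*"

# project1/settings.py
BROKER_URL = 'amqp://guest@127.0.0.1:5672/project1'

# project2/settings.py
BROKER_URL = 'amqp://guest@127.0.0.1:5672/project2'

With each project pointing at its own vhost, workers can only ever see messages published by that project.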
You should specify different queue settings for each of the projects. For example:
CELERY_QUEUES = {
    "celery": {
        "exchange": "project1_celery",
        "binding_key": "project1_celery"},
}
CELERY_DEFAULT_QUEUE = "celery"
For the second project you specify exchange and binding_key as project2_celery and so on.
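For illustration, project2's settings might then look like this (a sketch in the same pre-3.0 dict syntax; the names are placeholders):

CELERY_QUEUES = {
    "celery": {
        "exchange": "project2_celery",
        "binding_key": "project2_celery"},
}
CELERY_DEFAULT_QUEUE = "celery"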
The code I posted is for Celery<3.0. If you are using a newer version, it would probably look like the following (I haven't used the new versions myself yet, so I'm not sure):
from kombu import Exchange, Queue

CELERY_DEFAULT_QUEUE = 'celery'
CELERY_QUEUES = (
    Queue('celery', Exchange('project1_celery'), routing_key='project1_celery'),
)
You can read more in the celery docs: http://docs.celeryproject.org/en/latest/userguide/routing.html
With multiple Django projects sharing the same box, you need to explicitly "namespace" the @task definitions per project. The error message returned a namespace of "emails.models", which is not unique to any one project.
For example, if one project is named "project1" and another "project2", just add "name=" parameters to the @task decorators:
# project1
# emails.py
@task(name="project1.emails.my_email_function", queue="high_priority")
def my_email_function(user_id):
    return [x for x in user_id if spam()]
# project2
# tasks.py
@task(name="project2.tasks.my_task_function", queue="high_priority")
def my_task_function(user_id):
    return [x for x in user_id if blahblah()]
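Calling code doesn't need to change, since the tasks are still invoked through the decorated functions; a minimal sketch using the hypothetical functions above:

# Inside project1: the task is now registered as "project1.emails.my_email_function",
# so its messages can no longer be confused with another project's tasks.
my_email_function.delay(user_id)

# Or pick the queue explicitly at the call site:
my_email_function.apply_async(args=[user_id], queue="high_priority")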
I'd like to use Buildout to get django-registration working with Django 1.5, and I have a custom user using MyUser(AbstractUser). I used to get it from the recipe v0.8 and it was great. Since then, I switched to 1.5, removed my UserProfile and used this custom MyUser.
django-registration no longer works. I've been told I should get it from the repository, and that's what I'm trying to do. In the past I've used mr.developer to get a newer version of django-tastypie, and I tried to reproduce the same steps with django-registration. I get an error while calling bin/buildout though. Let's first look at the buildout config:
[buildout]
extensions = mr.developer
parts = myquivers
eggs =
    django-registration
include-site-packages = false
versions = versions
sources = sources
auto-checkout =
    django-registration

[sources]
django-registration = hg https://bitbucket.org/ubernostrum/django-registration

[versions]
django = 1.5

[myquivers]
recipe = djangorecipe
settings = development
eggs = ${buildout:eggs}
project = myquivers
Pretty simple config. It used to work with tastypie like I said, and I'm trying to do the same steps:
- python2.7 bootstrap.py
- bin/buildout
- bin/develop activate django-registration
- bin/develop checkout django-registration
- bin/myquivers syncdb
- bin/myquivers runserver
But it fails at the bin/buildout step:
$ bin/buildout
Getting distribution for 'mr.developer'.
Got mr.developer 1.25.
mr.developer: Creating missing sources dir /home/damien/Documents/projects/myquivers/src.
mr.developer: Queued 'django-registration' for checkout.
mr.developer: Cloned 'django-registration' with mercurial.
Develop: '/home/damien/Documents/projects/myquivers/src/django-registration'
Traceback (most recent call last):
File "/tmp/tmpzLDggG", line 13, in <module>
exec(compile(open('/home/damien/Documents/projects/myquivers/src/django-registration/setup.py').read(), '/home/damien/Documents/projects/myquivers/src/django-registration/setup.py', 'exec'))
File "/home/damien/Documents/projects/myquivers/src/django-registration/setup.py", line 30, in <module>
version=get_version().replace(' ', '-'),
File "/home/damien/Documents/projects/myquivers/src/django-registration/registration/__init__.py", line 5, in get_version
from django.utils.version import get_version as django_get_version
ImportError: No module named django.utils.version
While:
Installing.
Processing develop directory '/home/damien/Documents/projects/myquivers/src/django-registration'.
An internal error occured due to a bug in either zc.buildout or in a
recipe being used:
Traceback (most recent call last):
File "/home/damien/Documents/projects/myquivers/eggs/zc.buildout-2.1.0-py2.7.egg/zc/buildout/buildout.py", line 1923, in main
getattr(buildout, command)(args)
File "/home/damien/Documents/projects/myquivers/eggs/zc.buildout-2.1.0-py2.7.egg/zc/buildout/buildout.py", line 466, in install
installed_develop_eggs = self._develop()
File "/home/damien/Documents/projects/myquivers/eggs/zc.buildout-2.1.0-py2.7.egg/zc/buildout/buildout.py", line 707, in _develop
zc.buildout.easy_install.develop(setup, dest)
File "/home/damien/Documents/projects/myquivers/eggs/zc.buildout-2.1.0-py2.7.egg/zc/buildout/easy_install.py", line 871, in develop
call_subprocess(args)
File "/home/damien/Documents/projects/myquivers/eggs/zc.buildout-2.1.0-py2.7.egg/zc/buildout/easy_install.py", line 129, in call_subprocess
% repr(args)[1:-1])
Exception: Failed to run command:
'/usr/bin/python2.7', '/tmp/tmpzLDggG', '-q', 'develop', '-mxN', '-d', '/home/damien/Documents/projects/myquivers/develop-eggs/tmpYM_dR9build'
Looking at the error, Django does not seem to be installed on the system, and that's right: when I enter python2.7 and try >>> import django, it fails. But that's normal, and that's why I'm using Buildout: to not install Django system-wide, just locally for my project.
Any idea how to fix this? Is there a better alternative to taking this repo version? Please let me know; again, this is about a custom user / Django 1.5 / django-registration.
Thanks!