Cannot cache queryset in Django

In a view I have this caching code, which is supposed to save some costly queries:
from django.core.cache import cache

LIST_CACHE_TIMEOUT = 120
....
topics = cache.get('forum_topics_%s' % forum_id)
if not topics:
    topics = Topic.objects.select_related('creator') \
                          .filter(forum=forum_id).order_by("-created")
    print 'forum topics not in cache', forum_id  # Always printed out
    cache.set('forum_topics_%s' % forum_id, topics, LIST_CACHE_TIMEOUT)
I have no problem using this method to cache other queryset results and can't think of a reason for this strange behavior, so I'd appreciate your hints.
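As an aside, a slightly more defensive version of this pattern evaluates the queryset before caching and distinguishes a cache miss from a genuinely empty result. This is only a sketch and does not change the size issue described below:

import pickle
from django.core.cache import cache

LIST_CACHE_TIMEOUT = 120
_MISS = object()  # sentinel, so an empty list still counts as a cache hit

topics = cache.get('forum_topics_%s' % forum_id, _MISS)
if topics is _MISS:
    # Force evaluation so concrete rows (not a lazy queryset) are pickled.
    topics = list(
        Topic.objects.select_related('creator')
                     .filter(forum=forum_id)
                     .order_by('-created')
    )
    cache.set('forum_topics_%s' % forum_id, topics, LIST_CACHE_TIMEOUT)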

I figured out what caused this: a memcached value cannot be larger than 1 MB.
So I switched to Redis, and the problem was gone:
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        }
    }
}
IMPORTANT: make sure your Redis version is 2.6 or higher:
redis-server --version
Apparently, older Redis versions do not recognize the key timeout parameter and throw an error. This tripped me up for a while because the default Redis on Debian 7 was 2.4.
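To confirm the size explanation before (or instead of) switching backends, you can measure the pickled payload yourself, since that is roughly what the cache backend stores. A sketch; memcached's default item size limit is 1 MB:

import pickle

# Evaluate the queryset and pickle it, similar to what the cache backend does.
topics = list(
    Topic.objects.select_related('creator')
                 .filter(forum=forum_id)
                 .order_by('-created')
)
payload = pickle.dumps(topics, pickle.HIGHEST_PROTOCOL)
print('payload size: %d bytes' % len(payload))  # anything over ~1 MB is silently rejected by memcached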

Related

django-redis persisting json data

I have a small Django site which controls an astronomy dome and house automation. On startup the project loads 3 JSON files: relays, conditions and homeautomation. To avoid constant reading and writing to the Pi 4's SSD, I load the JSON files into Redis (on startup in apps, see below). I already have Redis running in Docker, as the project uses Celery.
My problem is that within a few minutes of loading the JSON into Redis, the data is cleared out of the cache.
I load each JSON file as a dictionary (dict) in apps:
cache.set("REDIS_ashtreeautomation_dict", dict, timeout=None)
and set
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://redis:6379",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "SERIALIZER": "django_redis.serializers.json.JSONSerializer",
            "TIMEOUT": None
        }
    }
}
I don't need the data to persist if the containers go down, and I don't need database functions. Caching these files is ideal, but I need them to 'stay alive' for the lifetime of the server.
Thank you.
Thank you, Kevin. Moving TIMEOUT out of OPTIONS and up to the backend level solved the issue:
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://redis:6379",
        "TIMEOUT": None,
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "SERIALIZER": "django_redis.serializers.json.JSONSerializer",
        }
    }
}
I am going to include some code to handle Redis's long-term eviction policies (i.e. reload the JSON data if the key disappears), since I don't want to delve into the Redis Docker container's configuration.
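For example, a small read-through helper along these lines would repopulate the cache whenever Redis has evicted the key. A sketch only; the file path and helper name are illustrative:

import json

from django.core.cache import cache

CACHE_KEY = "REDIS_ashtreeautomation_dict"
JSON_PATH = "/path/to/homeautomation.json"  # illustrative path

def get_automation_dict():
    """Return the cached dict, reloading it from disk if Redis evicted it."""
    data = cache.get(CACHE_KEY)
    if data is None:
        with open(JSON_PATH) as fh:
            data = json.load(fh)
        cache.set(CACHE_KEY, data, timeout=None)  # no expiry, as configured above
    return data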
Thanks
Ian

Shutting Off CELERY.BACKEND_CLEANUP on Amazon SQS

I'm using Django with Celery. I need to turn off the celery.backend_cleanup task that runs every day at 4 UTC. I've looked through the documentation and can't find how to disable it. Below is my latest attempt:
celery.py
from __future__ import absolute_import, unicode_literals
from django.conf import settings
from celery import Celery
import os

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "settings")

app = Celery('app')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

app.conf.beat_schedule = {
    'backend_cleanup': {
        'task': 'celery.backend_cleanup',
        'schedule': None,
        'result_expires': None
    },
}
I don't want this to run. How can I stop it?
UPDATE:
I also tried adding this to settings.py
CELERYBEAT_SCHEDULE = {
    'backend_cleanup': {
        'task': 'celery.backend_cleanup',
        'schedule': 0,
        'result_expires': 0
    },
}
I know deleting the task in the database is an option, but if beat has to be restarted later it creates backend_cleanup again and starts running it. I may not be the person maintaining this in the future, so I need this configured in code, not manually deleted from the database.
Here are a few approaches you can try:
You could use a schedule definition that effectively never runs, e.g. one that only fires every ~1000 years:
from datetime import timedelta

app.conf.beat_schedule = {
    # Disable the cleanup task by scheduling it to run every ~1000 years
    'backend_cleanup': {
        'task': 'celery.backend_cleanup',
        'schedule': timedelta(days=365 * 1000),
        'relative': True,
    },
}
You can try setting app.backend.supports_autoexpire = True, because this attribute is checked before the default backend_cleanup task is added (src).
You could create a subclass of your backend class and set supports_autoexpire = True on it.
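A rough sketch of that subclassing approach, assuming (for illustration only) the database result backend; swap in whichever backend class you actually use and point your result backend configuration at the new class:

# myproject/celery_backends.py -- illustrative module name
from celery.backends.database import DatabaseBackend  # replace with your backend class

class AutoexpiringBackend(DatabaseBackend):
    # Beat only installs the default celery.backend_cleanup entry when the
    # backend does not support auto-expiry, so advertising support skips it.
    supports_autoexpire = True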
Simply set the result_expires setting to None, but define it as a global setting in your settings.py rather than as a key bound to a particular task (e.g. backend_cleanup in your example). You could also pass it into your app declaration directly.
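With the settings-based configuration shown in the question (namespace='CELERY'), that global form would look something like the following; the setting name assumes the CELERY_ prefix that the namespace adds:

# settings.py
# Celery reads this as result_expires because of namespace='CELERY'.
# None disables result expiry, so beat should no longer install the daily
# celery.backend_cleanup entry for backends without auto-expiry.
CELERY_RESULT_EXPIRES = None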
The docs:
Default: Expire after 1 day.
Time (in seconds, or a timedelta object) for when after stored task tombstones will be deleted.
A built-in periodic task will delete the results after this time (celery.backend_cleanup), assuming that celery beat is enabled. The task runs daily at 4am.
A value of None or 0 means results will never expire (depending on backend specifications).
Note that this only works with the AMQP, database, cache, Couchbase, and Redis backends, and not on SQS!
The docs on SQS:
Warning
Don’t use the amqp result backend with SQS.
It will create one queue for every task, and the queues will not be collected. This could cost you money that would be better spent contributing an AWS result store backend back to Celery :)
You may be using django-celery-beat. If so, you'll have to delete the scheduled task from your database, which you can do through the Django admin panel.
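If you'd rather keep that in code than rely on a manual admin action, a one-off snippet like this (assuming django-celery-beat's PeriodicTask model) could be run from a data migration or management command. Note the asker's caveat still applies: the scheduler may recreate the entry on restart unless result_expires is also disabled.

# A sketch assuming django-celery-beat; removes the stored schedule entry.
from django_celery_beat.models import PeriodicTask

PeriodicTask.objects.filter(task='celery.backend_cleanup').delete()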

Configuring and using structlog with Django

Does anyone use structlog with Django? I'm looking for a code sample showing how to integrate Django's logging (which is done via the standard library) with structlog.
I've tried the code from the "Rendering Using structlog-based Formatters Within logging" example, with only the slightest modifications:
# From my settings.py, basically the same code as in the linked example
timestamper = structlog.processors.TimeStamper(fmt="%Y-%m-%d %H:%M:%S")
pre_chain = [
    structlog.stdlib.add_log_level,
    timestamper,
]

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": { ... },  # Exactly like in the linked example
    "handlers": { ... },    # Ditto, but only "default" handler (no files)
    "loggers": {
        "django": {
            "handlers": ["default"],
            "level": "INFO",
        },
        # I also had "" logger here, with the same config as "django",
        # but it's irrelevant for the example purposes.
    },
}
# Same as in the example
structlog.configure(
    processors=[
        structlog.stdlib.add_log_level,
        structlog.stdlib.PositionalArgumentsFormatter(),
        timestamper,
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
    wrapper_class=structlog.stdlib.BoundLogger,
    cache_logger_on_first_use=True,
)
However, I end up with logging errors. This is an excerpt of what happens on a simple GET request that ends up as a 404:
TypeError: not all arguments converted during string formatting
...
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py", line 152, in get_response
extra={'status_code': 404, 'request': request},
Message: '\x1b[2m2017-05-08 18:34:53\x1b[0m [\x1b[33m\x1b[1mwarning \x1b[0m] \x1b[1mNot Found: /favicon.ico\x1b[0m'
Arguments: ('/favicon.ico',)
I've tried to figure out what exactly goes on but lost my way in the debugger.
Of course, I could use structlog just for application logging and keep the standard library loggers as they are. However, I want all logging unified, so that my application's output is uniform and ready for parsing.
I'd greatly appreciate a code snippet that shows how to integrate structlog with Django correctly.
It’s most likely this bug, which will be fixed in structlog 17.2, due to be released soonish: https://github.com/hynek/structlog/pull/117 (feel free to comment there, or try it out to see if it fixes your problem).

ClassCastException[java.lang.Long cannot be cast to java.lang.Double] in Elasticsearch when trying to sort by an additional field

I'm having an issue where Elasticsearch running on Amazon Elasticsearch Service (v1.5 and v2.3) keeps returning the following error:
{u'status': 503, u'error': u'ReduceSearchPhaseException[Failed to execute phase [query], [reduce] ]; nested: ClassCastException[java.lang.Long cannot be cast to java.lang.Double]; '}
Here's the Elasticsearch query in Python:
def search_es(q, indices):
    es = get_search_client()
    search_body = {
        "sort": [ {'importance_score': 'desc'}, '_score', ],
        "query": {
            "match": {
                "display_name": q
            }
        }
    }
    try:
        res = es.search(index=indices, body=search_body)
    except TransportError as e:
        assert False, e.info
    data = []
    for hit in res['hits']['hits']:
        data.append(hit['_source'])
    return data
The code above runs perfectly against an Elasticsearch Docker container running locally, but it somehow fails in our test environment using the AWS version of ES.
If I remove the importance_score entry from the sort shown here, the query runs without errors (although with undesirable ordering):
"sort": [ {'importance_score': 'desc'}, '_score', ],
This leads me to believe there's something up with that field. importance_score is just a normal key that's calculated prior to indexing, and it has a maximum value of 19. I've tried variations of it by casting it to float, int, and long before indexing. All of them work locally but return the same error in the test environment.
Upgrading to ES v2.3 changes the structure of the error message but returns essentially the same error.
What might be causing this? Thanks for the help.
It turns out importance_score was being dynamically mapped to different primitive types for different document types. Forcing all mappings to use long for importance_score fixed the issue.
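For reference, one way to force that is to declare the mapping explicitly when creating each index instead of relying on dynamic mapping. A sketch against the 1.x/2.x-era API used above, with illustrative index and type names:

from elasticsearch import Elasticsearch

es = Elasticsearch()  # or get_search_client() from the code above
es.indices.create(
    index="profiles",        # illustrative index name
    body={
        "mappings": {
            "profile": {     # illustrative document type (ES 1.x/2.x style)
                "properties": {
                    # Pin the numeric type so every index/type agrees.
                    "importance_score": {"type": "long"},
                    "display_name": {"type": "string"},
                }
            }
        }
    },
)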

Cache values not appearing in Redis

I've got Redis set up as my cache in Django, with the following setting:
CACHES = {
    'default': {
        'BACKEND': 'redis_cache.RedisCache',
        'LOCATION': 'localhost:6379',
        'OPTIONS': {
            'PICKLE_VERSION': 1,
        },
    },
}
And I'm experimenting with it (new to Redis, want to understand it better). So, I go into my Django shell, and I do:
from django.core.cache import cache
cache.set('asdf', 2)
cache.get('asdf') # Returns 2
And then I go into redis-cli, where I expect to see the value, but none of these show any values:
KEYS *
GET *
GET 'asdf'
What's up with that?
Redis has 16 databases by default. As @Bernhard says in his comment, you can see how many keys each one has with:
INFO KEYSPACE
Which in my case returned:
# Keyspace
db0:keys=1,expires=0,avg_ttl=0
db1:keys=2,expires=2,avg_ttl=504748260
And you can SELECT the database you want to inspect with:
SELECT 1
At which point, sure enough, I can see the keys I expected:
KEYS *
1) ":1:asdf"
2) ":1:django.contrib.sessions.cacheg2l0bo9z88z8bn4q2ep0andjgo8zrzzk"