Django cache how to remove key prefix of ":1:" - django

I set a cache in Django as below:
from django.core.cache import cache
...
cache.set("cae9ad31b9206a1b5594813b509e1003", "test", timeout=10)
It generates a key like this:
:1:cae9ad31b9206a1b5594813b509e1003
How do I remove the :1: prefix from the key?

You don't have to worry about it, really, as this doesn't affect how you get the value of a key.
cache.get("cae9ad31b9206a1b5594813b509e1003")
# outputs
"test"
Why's this happening?
Django generates the cache keys by combining the key you give it with the version of the cache.
Example:
cache.set("my_key", "value", version=2)
# becomes
":2:my_key"
Since version=1 by default, that is why in your case the key becomes :1:cae9a....
This is called Cache Versioning. It is useful because this way you can have multiple cached versions of a particular object.
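The composition can be sketched in plain Python; this mirrors Django's default key function (default_key_func in django.core.cache.backends.base), which joins the prefix, version, and key with ":" separators:

```python
# Sketch of Django's default cache key function: the stored key is
# "<KEY_PREFIX>:<VERSION>:<KEY>".
def default_key_func(key, key_prefix, version):
    return "%s:%s:%s" % (key_prefix, version, key)

# With the default empty prefix and version 1:
print(default_key_func("cae9ad31b9206a1b5594813b509e1003", "", 1))
# -> :1:cae9ad31b9206a1b5594813b509e1003

# With version 2:
print(default_key_func("my_key", "", 2))
# -> :2:my_key
```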
How to override this?
If you still want to override this behaviour for whatever reason, you can do it as the docs suggest.
First create a function somewhere like this:
def my_key_maker(key, key_prefix, version):
    return key  # just return the key without doing anything
Then, in your CACHES settings do this:
CACHES = {
    "default": {
        "BACKEND": ...,
        # other settings ...
        "KEY_FUNCTION": "path.to.my_key_maker",
    }
}

Related

Django Sessions via Memcache: Cannot find session key manually

I recently migrated from database backed sessions to sessions stored via memcached using pylibmc.
Here are the CACHES, SESSION_CACHE_ALIAS and SESSION_ENGINE settings from my settings.py:
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': ['127.0.0.1:11211'],
    }
}
SESSION_CACHE_ALIAS = 'default'
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
Everything is working fine behind the scenes and I can see that it is using the new caching system. Running the get_stats() method from pylibmc shows me the number of current items in the cache and I can see that it has gone up by 1.
The issue is I'm unable to grab the session manually using pylibmc.
Upon inspecting the request session data in views.py:
def my_view(request):
    if request.user.is_authenticated():
        print request.session.session_key
        # the above prints something like: "1ay2kcv7axb3nu5fwnwoyf85wkwsttz9"
        print request.session.cache_key
        # the above prints something like: "django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9"
        return HttpResponse(status=200)
    else:
        return HttpResponse(status=401)
I noticed that cache_key prints with the default KEY_PREFIX, whereas session_key does not. Take a look at the comments in the code to see what I mean.
So I figured, "Ok great, one of these key names should work. Let me try grabbing the session data manually just for educational purposes":
import pylibmc
mc = pylibmc.Client(['127.0.0.1:11211'])
# Let's try key "1ay2kcv7axb3nu5fwnwoyf85wkwsttz9"
mc.get("1ay2kcv7axb3nu5fwnwoyf85wkwsttz9")
Hmm nothing happens, no key exists by that name. Ok no worries, let's try the cache_key then, that should definitely work right?
mc.get("django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9")
What? How am I still getting nothing back? As a test, I set and get a random key-value pair to check that the connection works, and it does. I run get_stats() again just to make sure the key exists, and I also confirm in the web app that my session is indeed working. So I conclude that there is a different naming scheme I'm unaware of.
If so, what is the correct naming scheme?
Yes, the cache key used internally by Django is, in general, different from the key sent to the cache backend (in this case pylibmc / memcached). Let us call these two keys the django cache key and the final cache key, respectively.
The django cache key given by request.session.cache_key is for use with Django's low-level cache API, e.g.:
>>> from django.core.cache import cache
>>> cache.get(request.session.cache_key)
{'_auth_user_hash': '1ay2kcv7axb3nu5fwnwoyf85wkwsttz9', '_auth_user_id': u'1', '_auth_user_backend': u'django.contrib.auth.backends.ModelBackend'}
The final cache key on the other hand, is a composition of the key prefix, the django cache key, and the cache version number. The make_key function (from Django docs) below demonstrates how these three values are composed to generate this key:
def make_key(key, key_prefix, version):
    return ':'.join([key_prefix, str(version), key])
By default, key_prefix is the empty string and version is 1.
Finally, by inspecting make_key we find that the correct final cache key to pass to mc.get is
:1:django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9
which has the form <KEY_PREFIX>:<VERSION>:<KEY>.
Note: the final cache key can be changed by defining KEY_FUNCTION in the cache settings.
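To illustrate, the final key can be reproduced with the make_key function shown above, run as plain Python with the default empty prefix and version 1:

```python
# Same composition as described above: <KEY_PREFIX>:<VERSION>:<KEY>
def make_key(key, key_prefix, version):
    return ':'.join([key_prefix, str(version), key])

django_cache_key = "django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9"
final_key = make_key(django_cache_key, "", 1)
print(final_key)
# -> :1:django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9
# This is the key you would pass to mc.get(...)
```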

Django Redis cache values

I have set a value in the Redis server externally, using a Python script:
r = redis.StrictRedis(host='localhost', port=6379, db=1)
r.set('foo', 'bar')
Then I tried to get the value during a web request, using the Django cache inside views.py:
from django.core.cache import cache
val = cache.get("foo")
It returns None. But when I try to get it from a direct connection:
from django_redis import get_redis_connection
con = get_redis_connection("default")
val = con.get("foo")
it returns the correct value 'bar'. How do the cache wrapper and the direct connection work?
Libraries usually add internal prefixes to the keys they store in Redis, so they are not confused with user-defined keys.
django-redis, for example, prepends ":1:" (the cache version) to every key you save through it.
So when you do cache.set('foo', 'bar') through Django, the key is actually stored as ":1:foo". Since a raw client doesn't know about the prefix, a plain get on 'foo' misses; you either have to go through the cache API or use the prefixed key:
cache.set('foo', 'bar')  # via the Django cache
r.get('foo')     # None
r.get(':1:foo')  # bar
So in the end, it comes down to the library you use: read its code to see exactly how it saves keys. redis-cli can be a valuable friend here. Basically, set a key with cache.set('foo', 'bar'), then go into redis-cli and run 'keys *' to see what key was actually stored for foo.
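The mismatch can be illustrated with a plain dict standing in for the Redis keyspace (a sketch of the prefixing behaviour, not django-redis internals):

```python
store = {}  # stand-in for the Redis keyspace

def cache_set(key, value, key_prefix="", version=1):
    # the cache layer stores under the composed key, not the raw one
    store["%s:%s:%s" % (key_prefix, version, key)] = value

cache_set("foo", "bar")
print(store.get("foo"))      # None: the raw key was never written
print(store.get(":1:foo"))   # bar
```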

How to match redis key patterns using native django cache?

I have a series of cache keys which follow this pattern:
key_x_y = value
Like:
'key_1_3' = 'foo'
'key_2_5' = 'bar'
'key_1_7' = 'baz'
Now I'm wondering: how can I iterate over all keys matching a pattern like key_1_* to get foo and baz, using the native Django cache.get()?
(I know that there are way, particularly for redis, that allow using more extensive api like iterate, but I'd like to stick to vanilla django cache, if possible)
This is not possible using Django's standard cache wrapper, because searching keys by pattern is a backend-dependent operation and is not supported by all the cache backends Django can use (e.g. memcached does not support it, but Redis does). So you will have to use a custom cache wrapper with a cache backend that supports this operation.
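If you must stay backend-agnostic, one hypothetical workaround (not part of any Django API) is to keep your own index of the keys you set and filter that index client-side with fnmatch:

```python
import fnmatch

# Hypothetical bookkeeping: record every key you cache.set() in a dict
# (here pre-populated to match the question's example).
key_index = {"key_1_3": "foo", "key_2_5": "bar", "key_1_7": "baz"}

matches = sorted(fnmatch.filter(key_index, "key_1_*"))
print(matches)                          # ['key_1_3', 'key_1_7']
print([key_index[k] for k in matches])  # ['foo', 'baz']
```

In a real app the values would come from cache.get_many(matches) rather than the index itself.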
Edit:
If you are already using django-redis then you can do
from django.core.cache import cache
cache.keys("key_1_*")
as explained here.
This will return list of keys matching the pattern then you can use cache.get_many() to get values for these keys.
cache.get_many(cache.keys("key_1_*"))
If the cache has the following entries:
cache = {'key_1_3': 'foo', 'key_2_5': 'bar', 'key_1_7': 'baz'}
you can get all the entries whose key matches key_1_*:
x = {k: v for k, v in cache.items() if k.startswith('key_1')}
Based on the documentation from django-redis
You can list all the keys with a pattern:
>>> from django.core.cache import cache
>>> cache.keys("key_1_*")
# ["key_1_3", "key_1_7"]
Once you have the keys, you can get the values:
>>> [cache.get(k) for k in cache.keys("key_1_*")]
# ['foo', 'baz']
You can also use cache.iter_keys(pattern) for a lazier, more memory-efficient iteration.
Or, as suggested by @Muhammad Tahir, you can use cache.get_many(cache.keys("key_1_*")) to get all the values in one go.
I saw several answers above mentioning django-redis.
Based on https://pypi.org/project/django-redis/
You can use the delete_pattern() method to delete all keys matching a pattern:
from django.core.cache import cache
cache.delete_pattern('key_1_*')

How to extend cache ttl (time-to-live) in django-redis?

I'm using django 1.5.4 and django-redis 3.7.1
I'd like to extend a cache entry's TTL (time-to-live) when I retrieve it.
Here is sample code
from django.core.cache import cache
foo = cache.get("foo")
if not foo:
    cache.set("foo", 1, timeout=100)
else:
    # extend the cache's time-to-live, something like:
    # cache.ttl("foo") = 200
I tried to find this option in the django-redis docs, but couldn't.
However, I noticed that setting a time-to-live on an existing key is available as a native Redis command, e.g. "EXPIRE foo 100".
I know that calling cache.set once more has the same effect, but I'd like a simpler method that only touches the time-to-live.
To extend the TTL (time-to-live) of a django-redis cache record, use expire(key, timeout):
Django-Redis: Expire & Persist
>>> from django.core.cache import cache
>>> cache.set("foo", 1, timeout=100)
>>> cache.ttl("foo")
100
You cannot extend the TTL (time-to-live) if the key has already expired:
>>> if cache.ttl("foo") > 0:
...     cache.expire("foo", timeout=500)
>>> cache.ttl("foo")
500
I solved this problem by (1) using raw client access and (2) extending the TTL value without overwriting the cached value.
Please refer to the following code:
from redis_cache import get_redis_connection  # module path in django-redis 3.x

con = get_redis_connection('default')
con.expire(":{VERSION}:" + "foo", 100)  # e.g. ":1:foo"; the number is the cache version (1 by default)

Hierarchical cache in Django

What I want to do is mark some values in the cache as related, so I can delete them all at once. For example, when I insert a new entry into the database, I want to delete everything in the cache that was based on the old database values.
I could always use cache.clear(), but that seems too brutal to me. Or I could store related values together in a dictionary and cache the dictionary. Or I could maintain some kind of index in an extra cache field. But all of this seems too complicated (and potentially slow?) to me.
What do you think? Is there an existing solution? Or is my approach wrong? Thanks for any answers.
Are you using the cache api? It sounds like it.
This post, which pointed me to these slides helped me create a nice generational caching system which let me create the hierarchy I wanted.
In short, you store a generation key (such as group) in your cache and incorporate the value stored into your key creation function so that you can invalidate a whole set of keys at once.
With this basic concept you could create highly complex hierarchies or just a simple group system.
For example:
from hashlib import md5

from django.core.cache import cache


class Cache(object):
    def generate_cache_key(self, key, group=None):
        """
        Generate a cache key, relating keys via an outside source (group).
        Generates a key such as 'group-1:KEY-your-key-here'.
        Note: consider this pseudo code and definitely incomplete.
        """
        key_fragments = [('key', key)]
        if group:
            key_fragments.append((group, cache.get(group, '1')))
        combined_key = ":".join('%s-%s' % (name, value) for name, value in key_fragments)
        hashed_key = md5(combined_key.encode()).hexdigest()
        return hashed_key

    def increment_group(self, group):
        """
        Invalidate an entire group.
        Note: cache.incr() raises ValueError if the group key was never set.
        """
        cache.incr(group)

    def set(self, key, value, group=None):
        key = self.generate_cache_key(key, group)
        cache.set(key, value)

    def get(self, key, group=None):
        key = self.generate_cache_key(key, group)
        return cache.get(key)
# example
>>> cache = Cache()
>>> cache.set('key', 'value', 'somehow_related')
>>> cache.set('key2', 'value2', 'somehow_related')
>>> cache.increment_group('somehow_related')
>>> cache.get('key') # both invalidated
>>> cache.get('key2') # both invalidated
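The generational trick can also be seen end-to-end with a plain dict standing in for the cache backend (a runnable sketch of the same idea, not Django API):

```python
import hashlib

backend = {}  # stand-in for the cache backend

def make_key(key, group):
    generation = backend.get(group, 1)           # current generation of the group
    raw = "group-%s:key-%s" % (generation, key)
    return hashlib.md5(raw.encode()).hexdigest()

def cached_set(key, value, group):
    backend[make_key(key, group)] = value

def cached_get(key, group):
    return backend.get(make_key(key, group))

def invalidate_group(group):
    # bump the generation: all keys built with the old generation become unreachable
    backend[group] = backend.get(group, 1) + 1

cached_set("key", "value", "related")
cached_set("key2", "value2", "related")
print(cached_get("key", "related"))   # value
invalidate_group("related")
print(cached_get("key", "related"))   # None
print(cached_get("key2", "related"))  # None
```

The orphaned entries are never deleted, they simply expire or get evicted, which is why this works well on LRU backends like memcached or Redis.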
Caching a dict or something serialised (with JSON or the like) sounds good to me. The cache backends are key-value stores like memcache; they aren't hierarchical.