Vanilla Django Caching Limitations and How to Solve Them
Why do we need Caching?
Caching is an important part of any backend application. It stores frequently used data in a faster storage layer alongside the database, which makes reads quicker. Besides faster access to the data, it also takes a significant amount of query load off the database.
How to use Vanilla Django Cache?
In Django, there are multiple ways to cache: the basic low-level cache, per-view caching, etc. Let's get a basic understanding of how Django's cache framework is used.
Basic cache
from django.core.cache import cache

cache.set("example", "hello, world!", 30)
result = cache.get("example")
print(result)  ## hello, world!
This is a basic example of storing and reading data from the cache. It stores the "hello, world!" string under the "example" key for 30 seconds (the timeout argument is in seconds).
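To make the timeout semantics concrete, here is a tiny pure-Python stand-in (not Django's actual implementation) that expires entries the same way: a value stored with a timeout is returned until the timeout elapses, after which `get` falls back to a default.

```python
import time

class TTLCache:
    """Minimal stand-in illustrating Django's cache timeout semantics (seconds)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value, timeout):
        # Remember the value together with its expiry time.
        self._store[key] = (value, time.monotonic() + timeout)

    def get(self, key, default=None):
        item = self._store.get(key)
        if item is None:
            return default
        value, expires_at = item
        if time.monotonic() >= expires_at:
            # Entry has expired: drop it and fall back to the default.
            del self._store[key]
            return default
        return value

cache = TTLCache()
cache.set("example", "hello, world!", 30)
print(cache.get("example"))  # hello, world!
```

Django's real `cache.get` also accepts a default as its second argument, which this sketch mirrors.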
Per-view Cache
Now let's have a look at how to cache API views.
from django.views.decorators.cache import cache_page
from rest_framework.response import Response

@cache_page(60 * 60)
def profile(request):
    result = User.objects.get(id=request.user.id)
    serializer = UserSerializer(result)
    return Response(serializer.data)
In this example, the result of the view is cached for 1 hour (or whatever timeout you pass). The response is stored in the cache and served without querying the database until the hour expires.
The Problem
As you might have noticed, there is no flow for updating the cache here. Users requesting the data within the 1-hour window will be served the same copy that went into the cache, even if the underlying record changes. So how do we update the cache? There are a few ways to do it.
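One common flow is invalidate-on-write: delete the cached copy whenever the record changes, so the next read repopulates it from the database. A minimal sketch, with plain dicts standing in for the cache and the database (all names here are illustrative):

```python
cache = {}
db = {42: {"id": 42, "username": "alice"}}

def get_profile(user_id):
    key = f"user_{user_id}"
    if key not in cache:
        # Cache miss: read from the "database" and populate the cache.
        cache[key] = db[user_id]
    return cache[key]

def update_profile(user_id, data):
    db[user_id] = data
    # Invalidate the stale copy so the next read refetches fresh data.
    cache.pop(f"user_{user_id}", None)

get_profile(42)                                   # populates the cache
update_profile(42, {"id": 42, "username": "alice2"})
print(get_profile(42)["username"])                # alice2
```

The sections below implement this idea with Django's cache and, later, with redis-om.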
Cache the Query
A similar idea you might hit on after looking at the above examples: just save the query result to the cache! This gets the job half done; let's see how.
## views.py
from django.core.cache import cache
from rest_framework.response import Response

def profile(request):
    ## Check if cache exists
    cache_result = cache.get(f"user_{request.user.id}")
    if cache_result is None:
        result = User.objects.get(id=request.user.id)
        ## Save the result
        cache.set(f"user_{request.user.id}", result)
        serializer = UserSerializer(result)
        return Response(serializer.data)
    serializer = UserSerializer(cache_result)
    return Response(serializer.data)
## models.py
from django.core.cache import cache
from django.db import models

class User(models.Model):
    username = models.CharField(max_length=30)
    email = models.EmailField()

    def save(self, *args, **kwargs):
        super().save(*args, **kwargs)
        cache.set(f"user_{self.id}", self)
Here, the view serves data from the cache when available, and the model's save() refreshes the cache whenever a User is created or updated. This way the cache always serves up-to-date data.
The limitation is that only a single User's data is cached per key; it is not practical to cache larger data sets this way. There are other limitations too: filtering and sorting are only possible if the whole queryset is cached.
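To see the limitation concretely: per-object keys answer point lookups well, but a query like "usernames starting with al" has no index to use, only a full scan of whatever happens to be cached. A dict-based sketch (illustrative names):

```python
object_cache = {
    "user_1": {"id": 1, "username": "alice"},
    "user_2": {"id": 2, "username": "bob"},
}

def get_user(user_id):
    # Point lookups work fine with per-object keys.
    return object_cache.get(f"user_{user_id}")

def find_by_prefix(prefix):
    # No index over usernames: the only option is a full scan of every
    # cached object -- and the cache may not even hold all users.
    return [u for u in object_cache.values() if u["username"].startswith(prefix)]

print(get_user(1)["username"])                   # alice
print([u["id"] for u in find_by_prefix("al")])   # [1]
```

An indexed cache model, which redis-om provides, is what removes this full-scan problem.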
How to solve it?
In this solution, we will use the redis-om package to manage our cached data. It provides validation, filtering, sorting, etc. Let's set up the cache service.
## libs/services.py
import logging

# django imports
from rest_framework import serializers

logger = logging.getLogger(__name__)

def dynamic_serializer(instance):
    class DynamicSerializer(serializers.ModelSerializer):
        class Meta:
            model = instance
            fields = "__all__"

    return DynamicSerializer
class CacheModelService:
    """
    Class for CRUD operations using redis_om.

    Attributes
    ----------
    model : JsonModel
        redis_om model

    Methods
    -------
    get(id):
        get redis json obj by id.
    """

    def __init__(self, model):
        """
        model: redis_om JSON model
        """
        self.model = model

    def get(self, id):
        """
        get redis json obj by id.

        Parameters
        ----------
        id : instance id
        """
        obj = self.model.get(id).dict()
        return obj

    def set(self, instance, sender):
        """
        saves given instance to cache

        Parameters
        ----------
        instance : instance to cache
        sender : main model that is cached

        Returns
        -------
        saved object
        """
        try:
            serializer = dynamic_serializer(sender)
            result = serializer(instance).data
            fields = list(self.model.__annotations__.keys())
            mapping = {
                field: result.get(field)
                for field in fields
                if result.get(field) is not None
            }
            # Create the object
            obj = self.model(**mapping)
            obj.save()
            logger.info(f"{sender} object created")
            return obj
        except Exception:
            logger.error(f"Failed to set {sender} object", exc_info=True)

    def delete(self, primary_key):
        """
        deletes cache objects

        Parameters
        ----------
        primary_key : pk or [pk] : object primary key(s)
        """
        try:
            if isinstance(primary_key, list):
                for pk in primary_key:
                    self.model.delete(pk)
            else:
                self.model.delete(primary_key)
            logger.info(f"Cache object {primary_key} deleted")
        except Exception:
            logger.error(f"Failed to delete Cache object {primary_key}", exc_info=True)

    def update(self, instance, sender):
        """
        updates cache object

        Parameters
        ----------
        instance : instance to cache
        sender : main model that is cached
        """
        try:
            id = instance.id
            primary_keys = self.get_pk(id)
            serializer = dynamic_serializer(sender)
            fields = list(self.model.__annotations__.keys())
            # Get the required fields
            result = serializer(instance).data
            mapping = {field: result.get(field) for field in fields}
            # Remove pk & None values: sending pk creates a new instance
            mapping = {
                key: value for key, value in mapping.items() if value is not None
            }
            mapping.pop("pk", None)
            for primary_key in primary_keys:
                self.patch(primary_key, mapping)
            logger.info(f"Cache object {id} updated")
        except Exception:
            logger.error(f"Failed to update Cache object {instance.id}", exc_info=True)
With this service layer, we can efficiently do CRUD operations on the cache model.
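To illustrate the service's contract without a running Redis, here is a sketch with a hypothetical in-memory stand-in for a redis_om JsonModel, plus a trimmed-down copy of the service (get and delete only); none of this is redis_om's real internals.

```python
class FakeJsonModel:
    """In-memory stand-in for a redis_om JsonModel (illustrative only)."""

    _store = {}

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def save(self):
        FakeJsonModel._store[self.id] = self
        return self

    @classmethod
    def get(cls, pk):
        return cls._store[pk]

    @classmethod
    def delete(cls, pk):
        cls._store.pop(pk, None)

    def dict(self):
        return dict(self.__dict__)

class CacheModelService:
    """Trimmed-down mirror of the article's service: just get and delete."""

    def __init__(self, model):
        self.model = model

    def get(self, id):
        # Fetch the cached object and return it as a plain dict.
        return self.model.get(id).dict()

    def delete(self, pk):
        self.model.delete(pk)

service = CacheModelService(FakeJsonModel)
FakeJsonModel(id=1, username="alice").save()
print(service.get(1))  # {'id': 1, 'username': 'alice'}
service.delete(1)
```

The real service works the same way, except the model's get/save/delete hit Redis instead of a dict.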
Now let's add the helper methods to set up automatic cache updates whenever a new record is created or an existing record is updated.
## libs/helpers.py
# python imports
from functools import wraps

# app imports
from libs.services import CacheModelService

def convert_to_dict(data):
    return list(map(dict, data))

def cache_filter(model, filters):
    """
    filters based on given expressions

    Returns
    -------
    serialized data
    """
    if isinstance(filters, list):
        query = [condition for condition in filters if condition.right is not None]
        queryset = model.find(*query).all()
    else:
        queryset = model.find(filters).all()
    serialize = convert_to_dict(queryset)
    return serialize

def cache_update(cls):
    """
    Add this decorator to cache models
    Required details: CACHE_MODEL
    """
    original_save = cls.save

    @wraps(original_save)
    def new_save(instance, *args, **kwargs):
        created = instance._state.adding
        # Call the original save method
        result = original_save(instance, *args, **kwargs)
        # Update cache data
        cache_model_service = CacheModelService(cls.CACHE_MODEL)
        if created:
            cache_model_service.set(instance, cls)
        else:
            cache_model_service.update(instance, cls)
        return result

    cls.save = new_save
    return cls
All set! We will use the same example as before to see how this setup helps us use the cache efficiently.
## models.py
from typing import Optional

from django.db import models
from redis_om import (
    JsonModel,
    Field,
)

from libs.helpers import cache_update

class UserCacheModel(JsonModel):
    """
    User cache model : users.model.User
    fields are stored for quick access
    searchable fields are indexed
    """

    id: int = Field(index=True)
    email: Optional[str]
    username: str = Field(index=True, full_text_search=True)

@cache_update
class User(models.Model):
    # The decorator reads this attribute to know which cache model to sync.
    CACHE_MODEL = UserCacheModel

    username = models.CharField(max_length=30)
    email = models.EmailField()
The UserCacheModel contains the required fields, plus the fields that need filtering and sorting. Note that redis-om builds its search indexes when you run its migrator (Migrator().run()), so run that after defining the model. Let's build the API views with the new cache setup integrated.
## views.py
from rest_framework.response import Response

from .models import UserCacheModel, User
from libs.helpers import cache_filter
from libs.services import CacheModelService

def profile(request):
    ## Serve the profile from cache
    cache_service = CacheModelService(UserCacheModel)
    user = cache_service.get(request.user.id)
    return Response(user)

def users(request):
    username = request.GET.get("username", None)
    filters = [UserCacheModel.username % username]
    result = cache_filter(UserCacheModel, filters)
    return Response(result)
This setup helps us efficiently use and manage cached data. With features like filtering, full-text search, sorting, and more, it can also be used in other ways apart from APIs. This is the initial setup and a basic use-case example.
Thanks for reading ✌🏻