
Django Redis cache

July 21, 2024

Vanilla Django caching limitations and how to solve them

Why do we need Caching?

Caching is an important part of any backend application. It stores frequently used data in a faster storage layer alongside the database, which makes access to that data much quicker. Besides faster reads, it also reduces the query load on the database.

How to use Vanilla Django Cache?

In Django, there are multiple ways to cache: the basic (low-level) cache API, per-view caching, and so on. Let's get a basic understanding of how Django's cache feature is used.

Basic cache

cache.set("example", "hello, world!", 30)

result = cache.get("example")
print(result) ## hello, world!

This is a basic example of storing and accessing data from the cache. It stores the "hello, world!" string under the "example" key for 30 seconds (the third argument to cache.set is the timeout in seconds).
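To make the timeout behaviour concrete, here is a minimal pure-Python sketch of a TTL cache. This is illustrative only (a hypothetical SimpleTTLCache, not Django's implementation), but it captures the same set/get-with-expiry contract:

```python
import time


class SimpleTTLCache:
    """Illustrative in-memory cache with per-key expiry."""

    def __init__(self):
        self._store = {}

    def set(self, key, value, timeout):
        # Store the value along with its absolute expiry time
        self._store[key] = (value, time.monotonic() + timeout)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry expired: drop it and behave like a cache miss
            del self._store[key]
            return default
        return value


cache = SimpleTTLCache()
cache.set("example", "hello, world!", 30)
print(cache.get("example"))  # hello, world!
```

Django's real backends (local memory, Memcached, Redis) follow this same pattern, just with the storage living outside the Python process.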

Per-view Cache

Now let's have a look at how to cache API views.

from django.views.decorators.cache import cache_page
from rest_framework.response import Response

from .models import User
from .serializers import UserSerializer

@cache_page(60 * 60)
def profile(request):
    result = User.objects.get(id=request.user.id)
    serializer = UserSerializer(result)
    return Response(serializer.data)

In this example, the response of this view is cached for one hour (or whatever timeout you pass to cache_page). The data is stored in the cache and served without querying the database until the hour is up.

The Problem

As you might have noticed, there is no flow for updating the cache here. That means users requesting the data within the one-hour cache window will all be served the same copy of the data that went into the cache. So how do we update the cache? There are a few ways to do it.
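The staleness problem is easy to see in a tiny sketch, with plain dicts standing in for the database and the cache:

```python
db = {"username": "alice"}
cache = {}


def get_profile():
    # Serve the cached copy if present; there is no update flow
    if "profile" in cache:
        return cache["profile"]
    cache["profile"] = dict(db)
    return cache["profile"]


first = get_profile()      # miss: reads the "db" and fills the cache
db["username"] = "bob"     # the underlying data changes...
second = get_profile()     # ...but the stale cached copy is still served
print(second["username"])  # alice
```

Until the cache entry expires or is explicitly invalidated, readers keep seeing "alice" even though the database already says "bob".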

Cache the Query

A first idea that comes to mind after looking at the examples above: just save the query result to the cache! This gets the job half done; let's see how.

## views.py
from django.core.cache import cache
from rest_framework.response import Response

from .models import User
from .serializers import UserSerializer

def profile(request):
    ## Check if cache exists
    cache_result = cache.get(f"user_{request.user.id}")
    if cache_result is None:
        result = User.objects.get(id=request.user.id)
        ## Save the result
        cache.set(f"user_{request.user.id}", result)
        serializer = UserSerializer(result)
        return Response(serializer.data)

    serializer = UserSerializer(cache_result)
    return Response(serializer.data)

## models.py
from django.core.cache import cache
from django.db import models

class User(models.Model):
    username = models.CharField(max_length=30)
    email = models.EmailField()

    def save(self, *args, **kwargs):
        super().save(*args, **kwargs)
        ## Refresh the cached copy on every create/update
        cache.set(f"user_{self.id}", self)

Here, the view serves data from the cache when it is available, and the model's save() refreshes the cache whenever a user is created or updated. This way, the cache always serves up-to-date data.
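The pattern above combines cache-aside reads with write-through updates. A self-contained sketch with plain dicts (hypothetical save_user/get_user helpers, not the Django code above) shows why the staleness problem goes away:

```python
db = {}
cache = {}


def save_user(user_id, data):
    # Write to the "database" and refresh the cache in the same step,
    # mirroring the overridden save() above
    db[user_id] = data
    cache[f"user_{user_id}"] = data


def get_user(user_id):
    # Cache-aside read: fall back to the "database" on a miss
    key = f"user_{user_id}"
    if key not in cache:
        cache[key] = db[user_id]
    return cache[key]


save_user(1, {"username": "alice"})
print(get_user(1)["username"])  # alice
save_user(1, {"username": "alice-renamed"})
print(get_user(1)["username"])  # alice-renamed: the cache was updated on save
```

Because every write refreshes the cache entry, reads never see a stale copy, at the cost of coupling the model's save path to the cache.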

The limitation here is that only a single user's data is cached; caching larger data sets this way is impractical. On top of that, operations like filtering and sorting are not possible on the cached objects.

How to solve it?

In this solution, we will use the redis-om package to manage our cached data. The package provides validation, filtering, sorting, and more. Let's set up the cache service.

## libs/services.py
import logging

# third-party imports
from rest_framework import serializers

logger = logging.getLogger(__name__)


def dynamic_serializer(instance):
    class DynamicSerializer(serializers.ModelSerializer):
        class Meta:
            model = instance
            fields = "__all__"

    return DynamicSerializer


class CacheModelService:
    """
    Class for CRUD operations using redis_om.
    ...

    Attributes
    ----------
    model : JsonModel
        redis_om model

    Methods
    -------
    get(id):
        get redis json obj by id.
    """

    def __init__(self, model):
        """
        model: redis_om JSON model
        """
        self.model = model

    def get(self, id):
        """
        get redis json obj by model id.

        Parameters
        ----------
        id : instance id
        """
        ## Look the object up by its indexed id field (the redis pk is auto-generated)
        results = self.model.find(self.model.id == id).all()
        return results[0].dict() if results else None

    def set(self, instance, sender):
        """
        saves given instance to cache

        Parameters
        ----------
        instance : instance to cache
        sender : main model that is cached

        Returns
        -------
        saved object
        """
        try:
            serializer = dynamic_serializer(sender)
            result = serializer(instance).data

            fields = list(self.model.__annotations__.keys())

            mapping = {
                field: result.get(field)
                for field in fields
                if result.get(field) is not None
            }

            # Create the object
            obj = self.model(**mapping)
            obj.save()

            logger.info(f"{sender} object created")
            return obj
        except Exception:
            logger.error(f"Failed to set {sender} object", exc_info=True)

    def delete(self, primary_key):
        """
        deletes cache objects

        Parameters
        ----------
        pk | [pk] : object primary key
        """
        try:
            if isinstance(primary_key, list):
                for pk in primary_key:
                    self.model.delete(pk)
            else:
                self.model.delete(primary_key)
            logger.info(f"Cache object {primary_key} deleted")
        except Exception:
            logger.error(f"Failed to delete Cache object {primary_key}", exc_info=True)

    def update(self, instance, sender):
        """
        updates cache object

        Parameters
        ----------
        instance : instance to cache
        sender : main model that is cached
        """
        try:
            id = instance.id
            primary_keys = self.get_pk(id)

            serializer = dynamic_serializer(sender)

            fields = list(self.model.__annotations__.keys())

            # Get the required fields
            result = serializer(instance).data
            mapping = {field: result.get(field) for field in fields}

            # Remove pK & None Values: sending pK creates new Instance
            mapping = {
                key: value for key, value in mapping.items() if value is not None
            }
            mapping.pop("pk", None)

            for primary_key in primary_keys:
                self.patch(primary_key, mapping)

            logger.info(f"Cache object {id} updated")
        except Exception:
            logger.error(f"Failed to update Cache object {id}", exc_info=True)

    def get_pk(self, id):
        """
        get redis primary keys of cache objects matching a model id.

        Parameters
        ----------
        id : instance id
        """
        return [obj.pk for obj in self.model.find(self.model.id == id).all()]

    def patch(self, primary_key, mapping):
        """
        partially update a cache object with the given field mapping.

        Parameters
        ----------
        primary_key : redis object primary key
        mapping : fields to update
        """
        obj = self.model.get(primary_key)
        for key, value in mapping.items():
            setattr(obj, key, value)
        obj.save()

With this service layer in place, we can efficiently perform CRUD operations on the cache model.

Now let's add the helper methods that set up automatic cache updates whenever a new record is created or an existing record is updated.

## libs/helpers.py
import logging

# python imports
from functools import wraps

# app imports
from libs.services import CacheModelService


def convert_to_dict(data):
    return list(map(dict, data))


def cache_filter(model, filters):
    """
    filters based on given expressions

    Returns
    -------
    serialized data
    """
    if isinstance(filters, list):
        ## Drop conditions whose right-hand value is None (e.g. missing query params)
        query = [condition for condition in filters if condition.right is not None]
        queryset = model.find(*query).all()
    else:
        queryset = model.find(filters).all()

    serialize = convert_to_dict(queryset)
    return serialize


def cache_update(cls):
    """
    Add this decorator to cache models
    Required details: CACHE_MODEL
    """
    original_save = cls.save

    @wraps(original_save)
    def new_save(instance, *args, **kwargs):
        created = instance._state.adding

        # Call the original save method
        result = original_save(instance, *args, **kwargs)

        # Update Cache data
        cache_model_service = CacheModelService(cls.CACHE_MODEL)
        if created:
            cache_model_service.set(instance, cls)
        else:
            cache_model_service.update(instance, cls)

        return result

    cls.save = new_save
    return cls

All set! We will use the same example as before to see how this setup helps us use the cache efficiently.

## models.py
from django.db import models
from typing import Optional
from redis_om import (
    JsonModel,
    Field,
)
from libs.helpers import cache_update

class UserCacheModel(JsonModel):
    """
    User cache model : users.model.User
    fields are stored for quick access
    searchable fields are indexed
    """

    id: int = Field(index=True)
    email: Optional[str]
    username: str = Field(index=True, full_text_search=True)


@cache_update
class User(models.Model):
    ## Tells the cache_update decorator which redis_om model to sync
    CACHE_MODEL = UserCacheModel

    username = models.CharField(max_length=30)
    email = models.EmailField()

The UserCacheModel defines the fields we need quick access to, plus the fields that require filtering and sorting (these must be indexed). Note that redis-om builds its search indexes via its Migrator, which needs to run (for example at application startup) before find() queries will work. Let's build the API views with the new cache setup integrated.

## views.py
from rest_framework.response import Response

from .models import UserCacheModel
from libs.helpers import cache_filter
from libs.services import CacheModelService


def profile(request):
    ## Serve the profile straight from cache
    cache_service = CacheModelService(UserCacheModel)
    user = cache_service.get(request.user.id)
    return Response(user)


def users(request):
    username = request.GET.get("username", None)
    ## %: full-text search match on the indexed username field
    filters = [UserCacheModel.username % username]

    result = cache_filter(UserCacheModel, filters)
    return Response(result)

This setup makes it efficient to use and manage cached data. With features like filtering, full-text search, and sorting, it can also be used in ways beyond serving APIs. This covers the initial setup and a basic use case.

Thanks for reading ✌🏻
