Welcome to the DjaoDjin Blog!

A place to share experiences in building Software-as-a-Service.

Multi-tier Implementation in Django

by Sebastien Mirolo on Thu, 16 Oct 2014

These are the notes from the lightning talk I gave at the SF Django Meetup on September 24th.

Three definitions of multi-tier

  • Templates
  • Database connections
  • URL resolvers

Templates

In order to pick site-specific templates at runtime, we will create a custom template loader. Since the request object is not passed as a parameter to get_template_sources, we need to create a middleware and a thread local to keep the required information.


$ cat multitier/middleware.py

from threading import local
from django.contrib.sites.models import get_current_site

_thread_locals = local()

class OrganizationMiddleware(object):

    def process_request(self, request):
        _thread_locals.site = get_current_site(request)
        return None

$ cat multitier/template_loader.py

from django.conf import settings
from django.template.loaders.filesystem import Loader as FilesystemLoader
from django.utils._os import safe_join

from .middleware import _thread_locals

class Loader(FilesystemLoader):

    def get_template_sources(self, template_name, template_dirs=None):
        try:
            if not template_dirs:
                template_dirs = settings.TEMPLATE_DIRS
            for theme in [_thread_locals.site.domain]:
                for template_dir in template_dirs:
                    try:
                        template_path = safe_join(
                            template_dir, theme, template_name)
                        yield template_path
                    except UnicodeDecodeError:
                        # The template dir name was a bytestring that wasn't
                        # valid UTF-8.
                        raise
                    except ValueError:
                        # The joined path was located outside
                        # of this particular template_dir (it might be
                        # inside another one, so this isn't fatal).
                        pass

        except AttributeError as attr_err:
            # Something bad happened. We don't even have a request.
            # The middleware might be misconfigured.
            raise RuntimeError(
                "%s, your middleware might be misconfigured." % attr_err)

$ diff -u prev settings.py
 TEMPLATE_LOADERS = (
+   'multitier.template_loader.Loader',
...
 MIDDLEWARE_CLASSES += (
+    'multitier.middleware.OrganizationMiddleware',
...

Database connections

There are many useful articles on how to set Django database routers for multi-tier architectures:

These articles have great pointers to get started. Armed with a middleware, a thread local for the site and a database router, the only bit of trickery here is to dynamically update django.db.connections.databases. As noted many times elsewhere, Django is full of dubious caches, and finding which variables to update in order to dynamically insert new database connections involves reading a lot of the Django source code, trial and error, and a bit of luck.

$ cat django/db/utils.py
...
DEFAULT_DB_ALIAS = 'default'
...
class ConnectionHandler(object):
    def __init__(self, databases=None):

    def __getitem__(self, alias):
        if hasattr(self._connections, alias):
            return getattr(self._connections, alias)   # <-- cache lookup HERE
...

$ cat django/db/__init__.py
...
connections = ConnectionHandler()
...

$ cat django/db/models/query.py
...
connection = connections[self.db]
...
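
The effect of that cache can be reproduced with a toy model. This is a simplified sketch, not Django's code: ToyConnectionHandler is a hypothetical class that only mimics the per-thread, per-alias lookup shown in the excerpt, with a plain dict standing in for an open connection.

```python
import threading


class ToyConnectionHandler(object):
    """Simplified model of a per-thread, per-alias connection cache."""

    def __init__(self, databases):
        self.databases = databases            # alias -> settings dict
        self._connections = threading.local()

    def __getitem__(self, alias):
        if hasattr(self._connections, alias):
            return getattr(self._connections, alias)   # cached object wins
        conn = dict(self.databases[alias])    # stand-in for opening a connection
        setattr(self._connections, alias, conn)
        return conn


handler = ToyConnectionHandler({"default": {"NAME": "main.sqlite"}})

# Adding an alias before its first lookup works.
handler.databases["acme"] = {"NAME": "acme.sqlite"}
first = handler["acme"]

# Changing the settings afterwards does not affect the cached object.
handler.databases["acme"] = {"NAME": "renamed.sqlite"}
second = handler["acme"]
```

In this model, new settings must be in place before the alias is first looked up in a given thread; afterwards the cached object is returned regardless of what the settings say.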

Sharding with the Django ORM and Django Routers are not directly related to multi-tier implementations but worth reading to understand the potential of Django database routers.
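
A router for this setup could look roughly like the sketch below. Django routers are plain classes, so nothing here imports Django. The names SiteRouter, ensure_database and site_alias, as well as the "<alias>.sqlite" naming scheme, are assumptions made for illustration; in a real project the databases dict would presumably be django.db.connections.databases.

```python
import threading

_thread_locals = threading.local()


def ensure_database(databases, alias, template_alias="default"):
    """Register settings for alias by copying template_alias if missing."""
    if alias not in databases:
        settings = dict(databases[template_alias])
        settings["NAME"] = "%s.sqlite" % alias    # hypothetical naming scheme
        databases[alias] = settings
    return alias


class SiteRouter(object):
    """Routes reads and writes to the database of the current site."""

    def __init__(self, databases=None):
        # Assumption: with Django this would be connections.databases.
        self.databases = databases if databases is not None else {}

    def _alias(self):
        alias = getattr(_thread_locals, "site_alias", None)
        if alias is None:
            return None     # no opinion, let the next router or default decide
        return ensure_database(self.databases, alias)

    def db_for_read(self, model, **hints):
        return self._alias()

    def db_for_write(self, model, **hints):
        return self._alias()


databases = {"default": {"ENGINE": "sqlite3", "NAME": "main.sqlite"}}
router = SiteRouter(databases)
before = router.db_for_read(None)      # no site set yet in this thread
_thread_locals.site_alias = "acme"     # what the middleware would do
after = router.db_for_write(None)
```

Registering the settings lazily inside the router keeps the insertion ahead of the first connection lookup for that alias.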

URL resolvers

URL routing in Django looks like code that was written with good intentions but never worked quite well enough to be useful, then was abandoned in favor of another approach that wasn't quite right either. In the end, the code base is complex and still inflexible in many ways, for no apparent reason other than the accumulated remains of past trial and error.

What we wanted here was to have some URLs that are global (e.g. /admin/) and some that are context-specific (e.g. /:organization/), without resorting to passing an organization argument to every reverse() call. After all, most Django apps were designed with a single tier in mind. The multi-tier layer (templates, database routing and URL routing magic) was to sit on top, in the front-end server.

The simple approach

With context-specific URLs only, it is possible to rely on subdomains (organization.localhost.localdomain) and some middleware + thread-local magic.
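
As a rough sketch of the subdomain variant, the helper below extracts the organization from the Host header; a middleware could then store the result in the thread local. The function name organization_from_host and the single-label rule are assumptions, not something Django provides.

```python
def organization_from_host(host, base_domain):
    """Return the subdomain label in front of base_domain, or None."""
    hostname = host.split(":", 1)[0].lower()     # drop an optional port
    suffix = "." + base_domain.lower()
    if not hostname.endswith(suffix):
        return None
    subdomain = hostname[:-len(suffix)]
    # Only accept a single label, e.g. "acme" but not "a.b".
    if not subdomain or "." in subdomain:
        return None
    return subdomain
```

For example, organization_from_host("acme.localhost.localdomain:8000", "localhost.localdomain") yields "acme", while the bare domain yields None.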

A variant when subdomains are not a workable solution, or localhost.localdomain/:organization/ URLs are just preferred, is to use the prefix thread-local defined in django/core/urlresolvers.py.

...
def set_script_prefix(prefix):
    """
    Sets the script prefix for the current thread.
    """
    if not prefix.endswith('/'):
        prefix += '/'
    _prefixes.value = prefix
...
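
The idea can be sketched without Django as follows. A middleware would split the organization off the request path, set it as the script prefix, and every reversed URL would then come out prefixed. Here split_organization and toy_reverse are hypothetical helpers, and the two prefix functions only imitate the thread local shown above.

```python
import threading

_prefixes = threading.local()


def set_script_prefix(prefix):
    """Imitation of the function above: store the prefix for this thread."""
    if not prefix.endswith("/"):
        prefix += "/"
    _prefixes.value = prefix


def get_script_prefix():
    return getattr(_prefixes, "value", "/")


def split_organization(path):
    """Split '/acme/billing/' into ('/acme/', '/billing/')."""
    parts = path.lstrip("/").split("/", 1)
    organization = parts[0]
    if not organization:
        return "/", "/"
    rest = "/" + (parts[1] if len(parts) > 1 else "")
    return "/%s/" % organization, rest


def toy_reverse(app_path):
    """Stand-in for reverse(): prepend the per-thread prefix to an app path."""
    return get_script_prefix() + app_path.lstrip("/")


prefix, path_info = split_organization("/acme/billing/invoices/")
set_script_prefix(prefix)
```

The apps keep reversing their URLs as if they were mounted at the root; only the prefix differs from one request to the next.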

The hack approach

In our case, we had both global (e.g. /admin/) and context-specific (e.g. /:organization/) URLs, so the simple approach could not work.

Django URL namespaces look great on paper, but in practice they require passing the namespace to the reverse() call, which means we could only use Django apps that were designed with namespaces in mind. Again, a non-starter.

At that point, the last bit of hope was the i18n LocaleRegexURLResolver. That resolver worked exactly as we wanted, so it was surely possible to emulate its class hierarchy to get our multi-tier URL resolver to work.

Unfortunately, a whole tangled mess of caching and global variables spread across RegexURLResolver, LocaleRegexURLResolver, etc. prevented us from having a clean implementation. In the end, we resorted to overriding DjangoTranslation (yes!):

$ cat multitier/urlresolvers.py

import re

from django.core.urlresolvers import RegexURLResolver
from django.conf.urls import patterns
from django.conf import settings
from django.utils.translation.trans_real import DjangoTranslation

from .middleware import _thread_locals

class OrganizationCode(DjangoTranslation):

    def __init__(self, *args, **kw):
        DjangoTranslation.__init__(self, *args, **kw)
        self._catalog = {}
        self.set_output_charset('utf-8')
        self.__language = 'en-us'

    def set_language(self, language):
        self.__language = language

    def language(self):
        return self.__language

    def to_language(self):
        site = _thread_locals.site
        if site:
            # site will be None when 'manage.py show_urls' is invoked.
            return site.domain
        return 'en-us'

class OrganizationRegexURLResolver(RegexURLResolver):
    """
    A URL resolver that always matches the active organization code
    as URL prefix.
    """
    def __init__(self, urlconf_name,
                 default_kwargs=None, app_name=None, namespace=None):
        super(OrganizationRegexURLResolver, self).__init__(
            None, urlconf_name, default_kwargs, app_name, namespace)

    @property
    def regex(self):
        site = _thread_locals.site
        if site:
            # site will be None when 'manage.py show_urls' is invoked.
            return re.compile('^%s/' % site.domain, re.UNICODE)
        return re.compile('^', re.UNICODE)

def site_patterns(prefix, *args):
    pattern_list = patterns(prefix, *args)
    return [OrganizationRegexURLResolver(pattern_list)]

$ cat multitier/middleware.py
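...
import gettext as gettext_module
import os
import sys

from django.conf import settings
from django.utils._os import upath
from django.utils.translation.trans_real import _active
...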
class OrganizationMiddleware(object):

    def process_request(self, request):
        _thread_locals.site = get_current_site(request)
        globalpath = os.path.join(os.path.dirname(
                upath(sys.modules[settings.__module__].__file__)), 'locale')
        _active.value = gettext_module.translation(
            'django', globalpath, class_=OrganizationCode)
        return None

$ cat multitier/urls.py
...
urlpatterns = patterns('',
        url(r'^admin/', include(admin.site.urls)),
)

urlpatterns += site_patterns('',
    # The group name "page" is assumed; the original name was lost.
    url(r'^(?P<page>\S+)?', PageView.as_view(), name='multitier_page'),
)

Conclusion

The foremost lesson here is that implementing global persistent caches without a proper CacheManager interface to clear, add, revoke, etc. entries is just evil. Not only does it make it really hard for developers to understand and modify code as necessary, it also has the potential to keep resources locked for a very long time and to let memory usage grow without bound, two scenarios you would rather avoid in a production environment.
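
To make the point concrete, here is one possible shape for such an interface. This is only a sketch under assumptions: the class and method names are made up, and the eviction policy (drop the entry that was added longest ago) is just the simplest bounded one.

```python
from collections import OrderedDict


class CacheManager(object):
    """Bounded cache with explicit add, get, revoke and clear operations."""

    def __init__(self, max_entries=128):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def add(self, key, value):
        self._entries.pop(key, None)      # re-adding moves the key to the end
        self._entries[key] = value
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)   # evict the oldest addition

    def get(self, key, default=None):
        return self._entries.get(key, default)

    def revoke(self, key):
        """Remove one entry; return True if it was present."""
        if key in self._entries:
            del self._entries[key]
            return True
        return False

    def clear(self):
        self._entries.clear()

    def __len__(self):
        return len(self._entries)
```

With an interface like this, inserting a new database or URL pattern at runtime becomes a matter of revoking the affected entries instead of hunting for module-level variables.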

There has been debate between Flask's global request variable and Django's request argument approach. In practice, as we have seen here, the entire rendering pipeline must be customizable per request. Thus, with Django, we end up creating a middleware and a thread local for all the times Django thought "you will never need the request here".