Welcome to the DjaoDjin Blog!

A place to share experiences in building Software-as-a-Service.

Documenting an API implemented with Django Rest Framework

by Sebastien Mirolo on Wed, 24 Oct 2018

With growth come many support issues, one of them being the ability to efficiently answer questions from developers and prospective customers. The API docs hosted on Read the Docs were showing their limitations. So we embark on surveying the landscape for API documentation, looking for a solution that fits DjaoDjin APIs implemented with Django Rest Framework.

Discovery

Stripe has been, and still is, regarded as having one of the best API docs, so we start by searching for "API documentation a la Stripe". It quickly turns up Ultimate Guide to 30+ API Documentation Solutions. We learn there that there are two major API specification formats on which every tool is based:

At first the relationship between Swagger and OpenAPI wasn't clear to us. Older online resources refer to Swagger. Newer resources refer to OpenAPI. Google returns completely different results when you search for one or the other. (On a side note, A list of awesome projects related to OpenAPI 3.0.x is a great resource to get started with OpenAPI.)

There are a lot of glowing recommendations for Slate, maybe because it was the first popular tool in the category. The ecosystem around Slate is significant. Slate though implements its own format, which is compatible with neither OpenAPI nor Blueprint. We found Django REST Swagger and tools to convert Swagger to Slate, so we started there for our documentation pipeline:

One of the first advantages of basing DjaoDjin's documentation on the DRF schema generator was finding a few issues unreported by pylint.

The resulting HTML was just not up to what we expected. It looked ugly. A lot of layout and formatting work was going to be required. One of the main stumbling blocks was going to be reStructuredText vs. Markdown.

So far, because the code is written in Python and the docs are generated by Sphinx and hosted on Read the Docs, we have been using reStructuredText markup in docstrings.

So maybe there is a more Pythonic approach. A few searches later, we stumble upon DRF OpenAPI, which is deprecated in favor of drf-yasg. I have to say drf-yasg is listed under Django Rest Framework: Documenting your API - Third party packages. It is just such a horrible name that it took a few hours of searching around the Web to find it.

Running drf-yasg for the first time

We followed the instructions to get started with drf-yasg and browsed the swagger/ and redoc/ URLs. Great, we see something. There is a loading screen though. The drf-yasg documentation also makes a lot of references to caching. That looks suspicious. Looking under the hood, we see that the rendering pipeline looks like this:

  1. drf-yasg extends DRF/coreapi to generate a Python OpenAPI schema.
  2. The Python schema is transformed into a JSON/YAML OpenAPI file.
  3. The JSON/YAML OpenAPI file is loaded and transformed in the browser by redoc or swagger-ui, both JavaScript libraries.

Solving one problem at a time

Tags

The documentation in drf-yasg on tags is light but we find an entry point through the settings TAGS_SORTER that leads to code in drf_yasg/generators.py:


class OpenAPISchemaGenerator(object):
...
    def get_paths(self, endpoints, components, request, public):
...
        prefix = self.determine_path_prefix(list(endpoints.keys())) or ''
...
        for path, (view_cls, methods) in sorted(endpoints.items()):
            operations = {}
            for method, view in methods:
...
                operations[method.lower()] = self.get_operation(view, path, prefix, method, components, request)
...
    def get_operation(self, view, path, prefix, method, components, request):
...
        operation_keys = self.get_operation_keys(path[len(prefix):], method, view)
        operation = view_inspector.get_operation(operation_keys)

and drf_yasg/inspectors/view.py:


class SwaggerAutoSchema(ViewInspector):
...
    def get_operation(self, operation_keys):
...
        tags = self.get_tags(operation_keys)
...
    def get_tags(self, operation_keys):
        return [operation_keys[0]]

In the code above, we can see that the way tags are generated is pretty crude. It takes the longest common prefix of all URL paths in the API and sets the (single) tag as the first word in-between two slashes ("/") after that.
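
To make that behavior concrete, here is a simplified sketch in plain Python. It is not the drf-yasg implementation; the helper names `common_prefix` and `tag_for`, as well as the sample paths, are hypothetical and only illustrate the "first word after the common prefix" rule.

```python
def common_prefix(paths):
    """Longest run of leading path components shared by all paths."""
    split_paths = [path.strip('/').split('/') for path in paths]
    shared = []
    for components in zip(*split_paths):
        if len(set(components)) != 1:
            break
        shared.append(components[0])
    return '/' + '/'.join(shared)


def tag_for(path, prefix):
    """First path component found after the common prefix."""
    remainder = path[len(prefix):].strip('/')
    return remainder.split('/')[0]


# Hypothetical URL paths, for illustration only.
paths = ['/api/billing/cart/', '/api/billing/checkout/', '/api/users/{user}/']
prefix = common_prefix(paths)
tags = {path: tag_for(path, prefix) for path in paths}
```

With these paths the shared prefix is `/api`, so the two billing endpoints end up under a `billing` tag and the last one under `users`. Any change to the computed prefix changes every tag at once.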

Nonetheless, we tried to live with it and rewrote some of the URL paths, but determine_path_prefix in rest_framework/schemas/generators.py still did not generate the expected path prefix, and thus the expected tags.

class SchemaGenerator(object):
...
    def determine_path_prefix(self, paths):
        prefixes = []
        for path in paths:
            components = path.strip('/').split('/')
            initial_components = []
            for component in components:
                if '{' in component:
                    break
                initial_components.append(component)
            prefix = '/'.join(initial_components[:-1])
            if not prefix:
                # We can just break early in the case that there's at least
                # one URL that doesn't have a path prefix.
                return '/'
            prefixes.append('/' + prefix + '/')
        return common_path(prefixes)

The code above returned early because of the following URL path: /api/uploaded-media{path}/. DjaoDjin has a lot of variable-length {path} arguments because it deals with role-based access control on HTTP request paths.
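
The early return can be replayed outside of Django. The sketch below copies only the per-path loop of determine_path_prefix into a standalone helper; the name `initial_prefix` and the first sample path are hypothetical.

```python
def initial_prefix(path):
    """Mirror the per-path loop shown above: keep the components that come
    before the first templated one, then drop the last of them."""
    initial_components = []
    for component in path.strip('/').split('/'):
        if '{' in component:
            break
        initial_components.append(component)
    return '/'.join(initial_components[:-1])


# A regular path (hypothetical) keeps 'api' as its prefix.
regular = initial_prefix('/api/users/{user}/accessibles/')

# The variable-length path leaves a single non-templated component, 'api',
# which is then dropped by the [:-1] slice, so the prefix is empty.
offending = initial_prefix('/api/uploaded-media{path}/')
```

An empty prefix is what triggers the `return '/'` branch. Once the prefix is `/`, the first word after it is `api` for every endpoint, which would explain why all operations ended up under the same tag.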

API returning list of elements

We had defined a URL that returns a list of row headers from a table in a report as such:

    url(r'^metrics/lines/(?P%s)/?' % settings.ACCT_REGEX,
        BalanceLineListAPIView.as_view(), name='saas_api_balance_lines'),

The View inherits from rest_framework.generics.ListCreateAPIView, yet drf-yasg stubbornly documents the API as returning a single item.

We have other Views and APIs which apparently are defined in the same way and for which drf-yasg correctly identifies the return value as a list of items.

Another interactive session stepping through the drf-yasg code base led to:

def is_list_view(path, method, view):
    """
    Return True if the given path/method appears to represent a list view.
    """
    if hasattr(view, 'action'):
        # Viewsets have an explicitly defined action, which we can inspect.
        return view.action == 'list'

    if method.lower() != 'get':
        return False
    if isinstance(view, RetrieveModelMixin):
        return False
    path_components = path.strip('/').split('/')

    # Offending line:

    if path_components and '{' in path_components[-1]:
        return False

    return True

So when the URL path of a view ends with a pattern parameter, the view is automatically categorized as returning a single object, irrespective of the actual view class it derives from.
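
The path-based part of that heuristic is easy to reproduce in isolation. The function below is a cut-down sketch that ignores the view entirely; the name `path_looks_like_list` and the `{account}` parameter are hypothetical.

```python
def path_looks_like_list(path, method):
    """Path-based part of the heuristic shown above (no view inspection)."""
    if method.lower() != 'get':
        return False
    path_components = path.strip('/').split('/')
    if path_components and '{' in path_components[-1]:
        # The last component is a URL parameter: assumed to be a detail view.
        return False
    return True


with_trailing_parameter = path_looks_like_list('/metrics/lines/{account}/', 'GET')
without_parameter = path_looks_like_list('/metrics/lines/', 'GET')
```

The first call is the shape of our balance lines URL: it is classified as a single-object endpoint even though the view behind it returns a list.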

Issues with URL override

We integrate multiple Django apps into a single Django project. Sometimes we need to override the default behavior of a single URL endpoint. Typically we do this with the following code:

$ cat signup/urls/api/auth.py:
...
    url(r'^api/auth/register/', JWTRegister.as_view(), name='api_register'),
...

$ cat project/urls.py
...
    url(r'^api/auth/register/', DjJWTRegister.as_view(), name='api_register'),
    url(r'^api/', include('signup.urls.api.auth')),
...

In those cases Django will route the URL to the expected (first) view, but drf_yasg will generate documentation for the overridden view, a behavior we traced back and worked around through the following patch:

class OpenAPISchemaGenerator
    def get_endpoints(self, request):

-        for path, method, callback in endpoints:
+        for path, method, callback in reversed(endpoints):
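
Why reversing the list helps can be shown with a toy model. The sketch assumes that the generator indexes endpoints in a dictionary, so that a later entry with the same path and method silently replaces an earlier one; the `collect` helper is hypothetical and the callbacks are replaced by plain strings.

```python
def collect(endpoints):
    """Index endpoints by (path, method); a later entry replaces an earlier one."""
    indexed = {}
    for path, method, callback in endpoints:
        indexed[(path, method)] = callback
    return indexed


# Order in which the URL patterns are declared: the project override comes first.
endpoints = [
    ('/api/auth/register/', 'POST', 'DjJWTRegister'),
    ('/api/auth/register/', 'POST', 'JWTRegister'),
]
key = ('/api/auth/register/', 'POST')

documented_before = collect(endpoints)[key]            # last declaration wins
documented_after = collect(reversed(endpoints))[key]   # first declaration wins
```

Django resolves a request with the first matching pattern, so iterating in reverse makes the documentation agree with the view that actually serves the request.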

Missing query parameters

We have some URLs that take query parameters. Example:

/api/profile/?q=name

The query parameters did not show up in the documentation. Until now we relied on django-extra-views to insert SearchableMixin and SortableMixin. drf-yasg does not find the query parameters defined through those mixins. We had to install django-filter, define custom filters and add them to the views through the filter_backends field.

$ pip install django-filter
$ cat saas/mixins.py:

class CartItemSmartListMixin(object):

    search_fields = ['user__username',
                     'user__first_name',
                     'user__last_name',
                     'user__email']

    sort_fields_aliases = [('slug', 'user__username'),
                           ('plan', 'plan'),
                           ('created_at', 'created_at')]

    filter_backends = (SortableSearchableFilterBackend(
        sort_fields_aliases, search_fields),)

Real-world API inputs and outputs

When you are defining APIs managing shopping carts and checkout pipelines, you often have APIs that take one type of serializer as input and return a different type of serializer on output.

POST /api/billing//checkout/
{
   "items": [{"reference": "abc", "quantity": 1}]
}

returns

{
  "created_at": "2016-06-21T23:42:44.270977Z",
  "processor_key": "pay_5lK5TacFH3gbKe",
  "amount": 2000,
  "unit": "usd",
  "last4": "1234",
  "exp_date": "2016-06-01",
  "state": "created"
}

These kinds of APIs are not handled very well by DRF. APIs that behave differently on read and write are also not well supported by DRF, though the need is common enough that some third-party projects extend the Django REST Framework classes, adding separate serializers for read and write operations.

Nonetheless, from a documentation perspective, that means we need to rely on the swagger_auto_schema decorator to document extra parameters.

The side effect of relying on decorators for extra parameters is that drf-yasg now becomes a required dependency of your code base. This is an unnecessary problem when you are shipping a reusable app. Extra parameters could have been better handled through fields on the View class. To keep drf-yasg optional, we created dummy prototypes in a compat.py file and imported those.


try:
    from drf_yasg.utils import swagger_auto_schema
except ImportError:
    from functools import wraps

    def swagger_auto_schema(function=None, **kwargs):
        """
        Dummy decorator when drf_yasg is not present.
        """
        # The documentation-only keyword arguments are accepted and ignored.
        def decorator(view_func):
            # `wraps` copies the name and docstring of the decorated view.
            @wraps(view_func)
            def _wrapped_view(request, *args, **kwargs):
                return view_func(request, *args, **kwargs)
            return _wrapped_view

        # Support both `@swagger_auto_schema` and `@swagger_auto_schema(...)`.
        if function:
            return decorator(function)
        return decorator

Additional pagination parameters

In a billing statement API, we include the balance due to avoid having two requests going to the backend. It is very unlikely that the front-end code requests a history of transactions and not the balance due. The API call looks like:

GET /billing//history/
returns
{
  "balance_amount": 2000,
  "balance_unit": "usd",
  "results": [{
    "created_at": "2017-02-01T00:00:00Z",
    "description": "Charge for 4 periods",
    "orig_account": "Liability",
    "orig_organization": "xia",
    "orig_amount": 112120,
    "orig_unit": "usd",
    "dest_account": "Funds",
    "dest_organization": "stripe",
    "dest_amount": 112120,
    "dest_unit": "usd"
  }]
}

In order to document the fields balance_amount and balance_unit, we need to create a Pagination inspector that derives from drf_yasg.inspectors.DjangoRestResponsePagination and implements get_paginated_response.
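
As an illustration of what such an inspector has to return, here is a sketch of the response schema for the example above. It uses plain dictionaries in the OpenAPI style rather than drf-yasg objects so that it stays self-contained; the function name `balance_paginated_schema`, the field types and the descriptions are assumptions derived from the sample response.

```python
def balance_paginated_schema(item_schema):
    """Describe a paginated response that also carries the balance due."""
    return {
        'type': 'object',
        'properties': {
            'balance_amount': {
                'type': 'integer',
                'description': "balance due",
            },
            'balance_unit': {
                'type': 'string',
                'description': "currency unit of the balance (ex: usd)",
            },
            # The regular list of results, described by the serializer schema.
            'results': {
                'type': 'array',
                'items': item_schema,
            },
        },
        'required': ['results'],
    }


# `transaction_schema` stands in for the schema of one item in the list.
transaction_schema = {'type': 'object'}
schema = balance_paginated_schema(transaction_schema)
```

In an actual inspector, get_paginated_response would build the equivalent structure and wrap the schema of the list it is given, so the extra fields show up next to results in the generated documentation.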

We then add this documentation paginator to settings.SWAGGER_SETTINGS:


$ cat settings.py:
...
SWAGGER_SETTINGS = {
    'DEFAULT_PAGINATOR_INSPECTORS': [
        'djaoapp.docs.DocBalancePagination',
...

Write-only parameters

When you have a registration and login API handing out JWT tokens, you will require write-only parameters.

Unfortunately a design constraint of the OpenAPI spec does not allow DRF write_only fields to be handled easily.

Here we are relying on Django templatetags to prevent parameters with a specific suffix (i.e. password, _key) from being generated in the output section of an API.
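
The filtering itself boils down to a few lines. The sketch below shows the kind of logic such a template filter could wrap; the function name `output_properties` and the sample field names are hypothetical, and only the two suffixes come from our setup.

```python
WRITE_ONLY_SUFFIXES = ('password', '_key')


def output_properties(properties, suffixes=WRITE_ONLY_SUFFIXES):
    """Drop the fields that should never appear in the output section."""
    return {name: schema for name, schema in properties.items()
            if not name.endswith(suffixes)}


# Hypothetical fields of a registration serializer.
properties = {
    'username': {'type': 'string'},
    'email': {'type': 'string'},
    'password': {'type': 'string'},
    'new_password': {'type': 'string'},
    'api_key': {'type': 'string'},
}
documented_output = output_properties(properties)
```

The obvious drawback of a naming convention is that every field has to follow it: a legitimate output field whose name ends with one of the suffixes is hidden as well.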

Converting Docstrings

When we started, the docstrings looked like:

class AccessibleByListAPIView(ListCreateAPIView):
    """
    ``GET`` lists all relations where an ``Organization`` is accessible by
     a ``User``. Typically the user was granted specific permissions through
     a ``Role``.

    ``POST`` Generates a request to attach a user to a role on an organization

     see :doc:`Flexible Security Framework `.

     **Example request**:

    .. code-block:: http

        GET  /api/users/alice/accessibles/

    **Example response**:

    .. code-block:: json

         {
             "count": 1,
             "previous": null,
             "results": [
                 {
                    "created_at": "2018-01-01T00:00:00Z",
                     "slug": "cowork",
                     "printable_name": "ABC Corp.",
                     "role_description": "manager",
                  }
             ]
         }
      """

Because of the way coreapi and drf-yasg extract documentation for API endpoints, we had to create pass-through post(), etc. methods to attach documentation to the correct HTTP method.

class AccessibleByListAPIView(ListCreateAPIView):
     """
     Lists all relations where an ``Organization`` is accessible by
     a ``User``. Typically the user was granted specific permissions through
     a ``Role``.

     see :doc:`Flexible Security Framework `.

     **Examples**

    .. code-block:: http

        GET  /api/users/alice/accessibles/ HTTP/1.1

    responds

    .. code-block:: json

         {
             "count": 1,
             "previous": null,
             "results": [
                 {
                    "created_at": "2018-01-01T00:00:00Z",
                     "slug": "cowork",
                     "printable_name": "ABC Corp.",
                     "role_description": "manager",
                  }
             ]
         }
    """

    def post(self, request, *args, **kwargs):
       """
       Creates a request to attach a user to a role on an organization
       """
       return super(AccessibleByListAPIView, self).post(request, *args, **kwargs)
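
The reason the pass-through method is needed can be sketched without DRF. The toy classes below assume that the description of an operation comes from the docstring of the method handler when there is one, and from the class docstring otherwise; `ListCreateBase`, `AccessibleByList` and `description_for` are hypothetical stand-ins.

```python
class ListCreateBase:
    # Stand-in for a generic view: handlers exist but carry no docstring.
    def get(self, request):
        return 'list'

    def post(self, request):
        return 'created'


class AccessibleByList(ListCreateBase):
    """Lists all relations where an organization is accessible by a user."""

    def post(self, request):
        """Creates a request to attach a user to a role on an organization."""
        return super().post(request)


def description_for(view_cls, method):
    """Prefer the docstring of the handler, else fall back to the class."""
    handler = getattr(view_cls, method.lower(), None)
    method_doc = getattr(handler, '__doc__', None)
    return (method_doc or view_cls.__doc__ or '').strip()


get_description = description_for(AccessibleByList, 'GET')
post_description = description_for(AccessibleByList, 'POST')
```

Without the overridden post(), both operations would share the class docstring, which is how a single docstring ended up describing GET and POST at the same time.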

reStructuredText to HTML

We need a way to go from RST to Markdown to HTML, or to integrate an RST to HTML formatter in the documentation pipeline. A few searches later ("openapi doc restructedtext", "documenting API with openapi and restructuredtext") led to a Sphinx extension to generate API docs from OpenAPI, to recommonmark (a Docutils bridge to CommonMark-py that Sphinx can use), and to a Docutils writer for converting reStructuredText documents to Markdown.

We tried to load the OpenAPI YAML file and send each description to Pandoc to convert reStructuredText to Markdown.

$ sudo port install pandoc
$ pandoc $f -f rst -t markdown -o $filename.md
$ echo '``CamelCase`` class' | pandoc -f rst -t markdown

It was not viable. References were escaped, examples removed.

The docutils utility rst2html produces a full HTML file, i.e. with the <html><head></head><body></body></html> structure. We need to dig deeper to make it produce an HTML snippet.

At this point, we decided to look at the code of drf-yasg and create our own APIDocView view with a Jinja2 template that implements a recursive macro. We derive a NoHeaderHTMLWriter from docutils.writers.html5_polyglot.Writer to convert from RST to HTML.

Extracting examples

By creating our own APIDocView, we are also able to parse and separate a single docstring into the text description with abstract parameters, and concrete examples.

We really want the text description and the code examples to go through separate processing pipelines (rst2html and pygments respectively).
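
A minimal sketch of that separation, using only the standard library, could look like the following. It assumes that the examples are written as `.. code-block::` directives as in the docstrings above; the function name `split_docstring` and the regular expression are ours and not part of drf-yasg or docutils.

```python
import re
import textwrap

# A directive line, a blank line, then indented (or blank) lines of code.
CODE_BLOCK = re.compile(
    r'^\.\. code-block:: (?P<lang>\w+)\n\n(?P<code>(?:[ \t]+.*\n|\n)+)',
    re.MULTILINE)


def split_docstring(docstring):
    """Return the description and a list of (language, code) examples."""
    text = textwrap.dedent(docstring).strip() + '\n'
    examples = [
        (match.group('lang'), textwrap.dedent(match.group('code')).strip())
        for match in CODE_BLOCK.finditer(text)]
    first = CODE_BLOCK.search(text)
    description = text[:first.start()] if first else text
    return description.strip(), examples


DOCSTRING = """
    Lists all relations where an organization is accessible by a user.

    .. code-block:: http

        GET /api/users/alice/accessibles/ HTTP/1.1

    .. code-block:: json

        {"count": 1}
    """
description, examples = split_docstring(DOCSTRING)
```

The description can then be handed to the RST to HTML writer, while each example is highlighted with the lexer named in its directive.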

Highlighting HTTP requests

The HTTP request examples would not be highlighted correctly. This required a look into the Pygments source code to realize that the HTTP version needs to be specified for Pygments to parse the request correctly.

# Write in the examples:
GET / HTTP/1.1
# instead of
GET /
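
The effect can be approximated with a regular expression. The pattern below is a simplified stand-in written for this post, not the pattern used by Pygments; it only illustrates that a request line without the HTTP version is not recognized as a request.

```python
import re

# Simplified approximation: method, target, then a mandatory HTTP version.
REQUEST_LINE = re.compile(
    r'^(GET|POST|PUT|DELETE|HEAD|OPTIONS|PATCH) +(\S+) +HTTP/(\d(?:\.\d)?)$')


def is_request_line(line):
    """True when the line looks like a complete HTTP request line."""
    return REQUEST_LINE.match(line) is not None


with_version = is_request_line('GET / HTTP/1.1')
without_version = is_request_line('GET /')
```

A check like this one can also be run over all the examples in the docstrings to catch request lines that are missing the version before the documentation is generated.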

Remove WIP APIs from documentation

With time running out and some APIs not yet fully designed, we remove those Work-in-Progress APIs from the generated OpenAPI schema by adding the following line to their view class.

    swagger_schema = None

Django Translation

Localization will also have an impact on documentation and potentially require some code refactoring.

Following the Django instructions for translations, we run python manage.py makemessages and run into a few error messages:

CommandError: Unable to find a locale path to store translations for file ...
Error: Empty dir

After creating the directory where the locale files will be stored, a few updates in settings.py take care of the first stumbling blocks:


MIDDLEWARE
+    'django.middleware.locale.LocaleMiddleware',

+LOCALE_PATHS = (os.path.join(BASE_DIR, APP_NAME, 'locale'),)

Testing browser language

To test the browser language, we go into Chrome Settings > Language and move "French" to the top of the list.

In the Chrome tools, we inspect and check the Accept-Language headers.

makemessages does not respect DEBUG settings

makemessages will go through all gettext statements in the code and templates, irrespective of DEBUG=0 or DEBUG=1.

if DEBUG:
    ENV_INSTALLED_APPS = (
        'debug_toolbar',
        'django_extensions',
        'drf_yasg')
else:
    ENV_INSTALLED_APPS = tuple([])

INSTALLED_APPS = ENV_INSTALLED_APPS + (
...

If your Django settings.py looks like the above, for example, makemessages will still add messages for debug_toolbar if you happen to have copied its templates locally in templates/debug_toolbar for some reason.

Format strings

Running the makemessages command led to a set of warnings.

$ cat decorators.py
    messages.error(request,
        _("%s is not a direct manager of %s.") % (
        request.user, organization))

$ python manage.py makemessages -l fr

The translator cannot reorder the arguments.
Please consider using a format string with named arguments,
 and a mapping instead of a tuple for the arguments.

It makes sense. First the arguments might be re-ordered in translation. A passive voice might make more sense in a language. Using named arguments also helps the translator understand the sentence better out-of-context.


$ cat decorators.py
    messages.error(request,
        _("%(user)s is not a direct manager of %(organization)s.") % {
        'user': request.user, 'organization': organization})

As a guideline, any message to be translated should use a format string with named arguments. That greatly helps when translating the message out of context.
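
The difference is easy to demonstrate with plain string formatting, outside of gettext. The French msgstr below is a made-up translation for the sake of the example, and `alice` and `cowork` are sample values.

```python
params = {'user': 'alice', 'organization': 'cowork'}

# Named arguments: the translator is free to put the organization first.
msgid = "%(user)s is not a direct manager of %(organization)s."
msgstr = "%(organization)s n'a pas %(user)s comme gestionnaire direct."
english = msgid % params
french = msgstr % params

# Positional arguments: the same reordering silently swaps the two values.
positional_msgstr = "%s n'a pas %s comme gestionnaire direct."
swapped = positional_msgstr % ('alice', 'cowork')
```

With named arguments the French sentence still says that cowork does not have alice as a direct manager. With positional arguments the reordered sentence claims the opposite, and nothing fails at runtime to warn about it.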

Marking gettext strings as safe

It happens that you need to mix gettext and mark_safe. For example, in code like:

    terms_of_use = forms.BooleanField(
       label=mark_safe(_("I agree with terms and conditions")
       % reverse('legal_terms_of_use')))

Sometimes it works and sometimes, you can be tied into knots trying to do so.

Cleaning up messages

Professional translators usually quote and charge their work by the word. That is one reason to make sure you do not have "password" and "Password" as msgid. Another reason to review messages is to have consistent documentation.

We use the --no-wrap command-line flag to write each msgid on a single line. It is then straightforward to create an alphabetical list of localized messages.

$ python manage.py makemessages -l pt --symlinks --no-wrap
$ grep 'msgid "' djaoapp/locale/pt/LC_MESSAGES/*.po | sort | cut -d '"' -f 2 > msgid.log

Just to make sure we understand the amount of work required from translators, we also count the number of unique words.

$ cat msgid.log | xargs -n1 | sort | uniq -c | wc -l
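
The same review can be scripted in Python, for example to flag msgids that differ only by letter case. This is a sketch that assumes the .po file was generated with --no-wrap, so that each msgid fits on one line; the function names and the sample content are ours.

```python
import re
from collections import defaultdict

# Non-empty msgids only: the header entry `msgid ""` is skipped.
MSGID = re.compile(r'^msgid "(.+)"$', re.MULTILINE)


def case_variants(po_text):
    """Group msgids that differ only by letter case."""
    groups = defaultdict(set)
    for msgid in MSGID.findall(po_text):
        groups[msgid.lower()].add(msgid)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


def unique_word_count(po_text):
    """Count distinct whitespace-separated words across all msgids."""
    words = set()
    for msgid in MSGID.findall(po_text):
        words.update(msgid.split())
    return len(words)


SAMPLE = '''msgid ""
msgstr ""

msgid "Password"
msgstr ""

msgid "password"
msgstr ""

msgid "Sign in"
msgstr ""
'''
duplicates = case_variants(SAMPLE)
word_count = unique_word_count(SAMPLE)
```

On the sample, the two spellings of "password" are reported together, which is the kind of duplicate worth fixing in the source code before sending the file to a translator.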

More to read

In our search for how to write good API documentation, we stumbled upon Documenting APIs, a guide for technical writers, and 4 Common Anti-patterns to Avoid in Your Documentation Part 1 and Part 2. All three are really worth a read.

If you are looking for more posts on APIs and Django, Django Rest Framework, AngularJS and permissions, Porting a Django app to Jinja2 templates, and Date/time, back and forth between Javascript, Django and PostgreSQL are worth reading next.

More technical posts are also available on the DjaoDjin blog, as well as business lessons we learned running a subscription hosting platform.