Documenting an API implemented with Django Rest Framework
by Sebastien Mirolo on Wed, 24 Oct 2018
With growth come many support issues, one of them being able to efficiently answer questions from developers and prospective customers. The API docs hosted on Read the Docs were showing their limitations. So we set out to survey the landscape for API documentation, looking for a solution that fits the DjaoApp APIs implemented with Django Rest Framework.
Discovery
Stripe has been and is still regarded as having one of the best API docs, so we start by searching for "API documentation a la Stripe". It quickly turns up Ultimate Guide to 30+ API Documentation Solutions. We learn there that there are two major API specification formats on which every tool is based:
- OpenAPI (formerly known as Swagger), and
- API Blueprint.
At first the relationship between Swagger and OpenAPI wasn't clear to us. Older online resources refer to Swagger. Newer resources refer to OpenAPI. Google returns completely different results when you search for one or the other. (On a side note, A list of awesome projects related to OpenAPI 3.0.x is a great resource to get started with OpenAPI.)
There are a lot of glowing recommendations for Slate, maybe because it is the first popular tool in the category. The ecosystem around Slate is significant. Slate though implements its own format, which is compatible with neither OpenAPI nor API Blueprint. We found Django REST Swagger and tools to convert Swagger to Slate, so we started there for our documentation pipeline.
One of the first advantages of basing DjaoDjin's documentation on the DRF schema generator was finding a few issues unreported by pylint.
The resulting HTML was just not up to what we could expect. It looked ugly. A lot of layout and formatting was going to be required. One of the main stumbling blocks was going to be reStructuredText vs. Markdown.
So far, because the code is written in Python and the docs are generated by Sphinx and hosted on Read the Docs, we have been using reStructuredText markup in docstrings.
So maybe there is a more Pythonic approach. A few searches later, we stumble upon DRF OpenAPI, which is deprecated in favor of drf-yasg. I have to say drf-yasg is listed under Django Rest Framework: Documenting your API - Third party packages. It is just such a horrible name that it took a few hours of searching around the Web to find it.
Running drf-yasg for the first time
We followed the instructions to get started with drf-yasg and browsed the swagger/ and redoc/ URLs. Great, we see something. There is a loading screen though. The drf-yasg documentation also makes a lot of references to caching. That looks suspicious. Looking under the hood, we see that the rendering pipeline looks like this:
- drf-yasg extends DRF/coreapi to generate a Python OpenAPI schema.
- The Python schema is transformed into a JSON/YAML OpenAPI file.
- The JSON/YAML OpenAPI file is loaded and transformed by redoc or swagger-ui, both JavaScript libraries, in the browser.
Solving one problem at a time
Tags
The documentation in drf-yasg on tags is light but we find an entry point through the TAGS_SORTER setting that leads to code in drf_yasg/generators.py:
    class OpenAPISchemaGenerator(object):
        ...
        def get_paths(self, endpoints, components, request, public):
            ...
            prefix = self.determine_path_prefix(list(endpoints.keys())) or ''
            ...
            for path, (view_cls, methods) in sorted(endpoints.items()):
                operations = {}
                for method, view in methods:
                    ...
                    operations[method.lower()] = self.get_operation(
                        view, path, prefix, method, components, request)
            ...

        def get_operation(self, view, path, prefix, method, components, request):
            ...
            operation_keys = self.get_operation_keys(
                path[len(prefix):], method, view)
            operation = view_inspector.get_operation(operation_keys)
and drf_yasg/inspectors/view.py:
    class SwaggerAutoSchema(ViewInspector):
        ...
        def get_operation(self, operation_keys):
            ...
            tags = self.get_tags(operation_keys)
            ...

        def get_tags(self, operation_keys):
            return [operation_keys[0]]
In the code above, we can see that the way tags are generated is pretty crude. It takes the longest common prefix of all URL paths in the API and sets the (single) tag as the first word in-between two slashes ("/") after that.
Nonetheless, we tried to live with it and rewrote some of the URL paths, but determine_path_prefix in rest_framework/schemas/generators.py still did not generate the expected path prefix, and thus the expected tags.
    class SchemaGenerator(object):
        ...
        def determine_path_prefix(self, paths):
            prefixes = []
            for path in paths:
                components = path.strip('/').split('/')
                initial_components = []
                for component in components:
                    if '{' in component:
                        break
                    initial_components.append(component)
                prefix = '/'.join(initial_components[:-1])
                if not prefix:
                    # We can just break early in the case that there's at least
                    # one URL that doesn't have a path prefix.
                    return '/'
                prefixes.append('/' + prefix + '/')
            return common_path(prefixes)
The code above returned early because of the following URL path: /api/uploaded-media{path}/. DjaoDjin has a lot of variable-length {path} arguments because it deals with role-based access control on HTTP request paths.
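To see the effect concretely, here is a minimal sketch that replays the prefix logic quoted above in plain Python, without Django. It rests on a few assumptions: common_path is a simplified stand-in for the DRF helper, first_tag is a hypothetical helper that mimics "first path component after the prefix", and the two billing and metrics paths are only illustrative.

```python
def common_path(paths):
    """Simplified stand-in for DRF's helper: shared leading path components."""
    split_paths = [path.strip('/').split('/') for path in paths]
    common = []
    for parts in zip(*split_paths):
        if len(set(parts)) != 1:
            break
        common.append(parts[0])
    return '/' + '/'.join(common)


def determine_path_prefix(paths):
    """Same logic as the method quoted above, written as a free function."""
    prefixes = []
    for path in paths:
        initial_components = []
        for component in path.strip('/').split('/'):
            if '{' in component:
                break
            initial_components.append(component)
        prefix = '/'.join(initial_components[:-1])
        if not prefix:
            # A single path without a usable prefix resets it for every path.
            return '/'
        prefixes.append('/' + prefix + '/')
    return common_path(prefixes)


def first_tag(path, prefix):
    """Hypothetical helper: first path component found after the prefix."""
    return path[len(prefix):].strip('/').split('/')[0]


# Illustrative paths (assumed parameter names)
paths = ['/api/billing/{organization}/checkout/',
         '/api/metrics/lines/{organization}/']
prefix = determine_path_prefix(paths)
tags = [first_tag(path, prefix) for path in paths]

# Adding the variable-length path changes the prefix for all endpoints.
paths_with_media = paths + ['/api/uploaded-media{path}/']
prefix_with_media = determine_path_prefix(paths_with_media)
tags_with_media = [first_tag(path, prefix_with_media)
                   for path in paths_with_media]
```

With the two regular paths the prefix is /api and the tags are billing and metrics. Once /api/uploaded-media{path}/ is added, the prefix collapses to / and every endpoint ends up tagged api.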
API returning list of elements
We had defined a URL that returns a list of row headers from a table in a report as follows:
url(r'^metrics/lines/(?P%s)/?' % settings.ACCT_REGEX, BalanceLineListAPIView.as_view(), name='saas_api_balance_lines'),
The View inherits from rest_framework.generics.ListCreateAPIView, yet drf-yasg stubbornly documents the API as returning a single item.
We have other Views and APIs which apparently are defined in the same way and for which drf-yasg correctly identifies the return value as a list of items.
Another interactive session looking through the drf-yasg code base led to:
    def is_list_view(path, method, view):
        """
        Return True if the given path/method appears to represent a list view.
        """
        if hasattr(view, 'action'):
            # Viewsets have an explicitly defined action, which we can inspect.
            return view.action == 'list'

        if method.lower() != 'get':
            return False
        if isinstance(view, RetrieveModelMixin):
            return False
        path_components = path.strip('/').split('/')
        # Offending line:
        if path_components and '{' in path_components[-1]:
            return False
        return True
So when a URL path ends with a pattern, the view is automatically categorized as returning a single object, irrespective of the actual view class it derives from.
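As a quick check of that reading, the sketch below keeps only the path heuristic (the viewset and mixin checks are left out) and applies it to two paths. The function name looks_like_list_view and the parameter names {organization} and {user} are assumptions made for the illustration.

```python
def looks_like_list_view(path, method):
    """Path heuristic only; the viewset and mixin checks are omitted."""
    if method.lower() != 'get':
        return False
    path_components = path.strip('/').split('/')
    if path_components and '{' in path_components[-1]:
        # A trailing URL parameter is read as "retrieve one object".
        return False
    return True


# The balance lines URL ends with a parameter ...
ends_with_parameter = looks_like_list_view(
    '/api/metrics/lines/{organization}/', 'GET')
# ... while this one ends with a literal component.
ends_with_literal = looks_like_list_view(
    '/api/users/{user}/accessibles/', 'GET')
```

The first call returns False even though the underlying view is a list view, which matches the single-item documentation we were seeing.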
Issues with URL override
We integrate multiple Django apps into a single Django project. Sometimes we need to override the default behavior of a single URL endpoint. Typically we do this with the following code:
    $ cat signup/urls/api/auth.py:
    ...
    url(r'^api/auth/register/',
        JWTRegister.as_view(), name='api_register'),
    ...

    $ cat project/urls.py
    ...
    url(r'^api/auth/register/',
        DjJWTRegister.as_view(), name='api_register'),
    url(r'^api/', include('signup.urls.api.auth')),
    ...
In those cases Django will route the URL to the expected (first) view, but drf_yasg will generate documentation for the overridden view, a behavior we traced back and worked around through the following patch:
    class OpenAPISchemaGenerator
        def get_endpoints(self, request):
    -        for path, method, callback in endpoints:
    +        for path, method, callback in reversed(endpoints):
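Why reversing the list helps can be sketched in a few lines, assuming the generator keeps one view per path and lets a later entry overwrite an earlier one. That assumption is our reading of the behavior, not a copy of the drf-yasg code, and index_by_path is a hypothetical helper.

```python
def index_by_path(endpoints):
    """Keep one view per path; a later entry overwrites an earlier one."""
    view_by_path = {}
    for path, method, view_name in endpoints:
        view_by_path[path] = view_name
    return view_by_path


# Endpoints in URL resolution order: Django serves the first match.
endpoints = [
    ('/api/auth/register/', 'POST', 'DjJWTRegister'),  # project override
    ('/api/auth/register/', 'POST', 'JWTRegister'),    # included from signup
]

documented_before = index_by_path(endpoints)['/api/auth/register/']
documented_after = index_by_path(reversed(endpoints))['/api/auth/register/']
```

Iterating in reverse makes the first declared view the last one written, so the documented view (DjJWTRegister) is again the one Django actually routes to.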
Missing query parameters
We have some URLs that take query parameters. Example:
/api/profile/?q=name
The query parameters did not show up in the documentation. Until now we relied on django-extra-views to insert SearchableMixin and SortableMixin. drf-yasg does not find the query parameters defined through those mixins. We had to install django-filter, define custom filters and add them to the views as filter_backends fields.
    $ pip install django-filter

    $ cat saas/mixins.py:
    class CartItemSmartListMixin(object):

        search_fields = ['user__username',
                         'user__first_name',
                         'user__last_name',
                         'user__email']

        sort_fields_aliases = [('slug', 'user__username'),
                               ('plan', 'plan'),
                               ('created_at', 'created_at')]

        filter_backends = (SortableSearchableFilterBackend(
            sort_fields_aliases, search_fields),)
Real-world API inputs and outputs
When you are defining APIs managing shopping carts and checkout pipelines, you often have APIs that take one type of serializer as input and return a different type of serializer on output.
    POST /api/billing//checkout/
    {
      "items": [{"reference": "abc", "quantity": 1}]
    }

    returns

    {
      "created_at": "2016-06-21T23:42:44.270977Z",
      "processor_key": "pay_5lK5TacFH3gbKe",
      "amount": 2000,
      "unit": "usd",
      "last4": "1234",
      "exp_date": "2016-06-01",
      "state": "created"
    }
These kinds of APIs are not handled very well by DRF. APIs that behave differently on read and write are also not well supported in DRF, though the need is common enough that some third-party projects extend the Django REST Framework classes with separate serializers for read and write operations.
Nonetheless, from a documentation perspective, that means we need to rely on the swagger_auto_schema decorator to document extra parameters.
The side-effect of relying on decorators for extra parameters is that drf-yasg now becomes a required dependency of your code base. This is an unnecessary problem when you are shipping a reusable app. Extra parameters could have been better handled through fields on the View class. To keep drf-yasg optional, we created dummy prototypes:
    try:
        from drf_yasg.utils import swagger_auto_schema
    except ImportError:
        from functools import wraps
        from django.utils.decorators import available_attrs

        def swagger_auto_schema(function=None, **kwargs):
            """
            Dummy decorator when drf_yasg is not present.
            """
            def decorator(view_func):
                @wraps(view_func, assigned=available_attrs(view_func))
                def _wrapped_view(request, *args, **kwargs):
                    return view_func(request, *args, **kwargs)
                return _wrapped_view

            if function:
                return decorator(function)
            return decorator
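The fallback pattern can be tried outside Django with a standard-library-only version. In the sketch below, noop_auto_schema, CheckoutView and the 'ChargeSerializer' placeholder are hypothetical names used for illustration; the point is only that the decorated method keeps its behavior, name and docstring when the documentation keywords are ignored.

```python
from functools import wraps


def noop_auto_schema(function=None, **kwargs):
    """Hypothetical stand-in: accepts documentation keywords, ignores them."""
    def decorator(view_func):
        @wraps(view_func)
        def _wrapped_view(*view_args, **view_kwargs):
            return view_func(*view_args, **view_kwargs)
        return _wrapped_view

    if function:
        # Used as a bare decorator, without parentheses
        return decorator(function)
    return decorator


class CheckoutView(object):

    @noop_auto_schema(responses={200: 'ChargeSerializer'})  # placeholder value
    def post(self, request):
        """Places an order."""
        return {'state': 'created'}


result = CheckoutView().post(request=None)
```

Preserving the docstring through functools.wraps matters here, since the documentation generator reads method docstrings.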
Additional pagination parameters
In a billing statement API, we include the balance due to avoid having two requests going to the backend. It is very unlikely that the front-end code requests a history of transactions and not the balance due. The API call looks like:
    GET /billing//history/

    returns

    {
      "balance_amount": 2000,
      "balance_unit": "usd",
      "results": [{
        "created_at": "2017-02-01T00:00:00Z",
        "description": "Charge for 4 periods",
        "orig_account": "Liability",
        "orig_organization": "xia",
        "orig_amount": 112120,
        "orig_unit": "usd",
        "dest_account": "Funds",
        "dest_organization": "stripe",
        "dest_amount": 112120,
        "dest_unit": "usd"
      }]
    }
In order to document the fields balance_amount and balance_unit, we need to create a Pagination inspector that derives from drf_yasg.inspectors.DjangoRestResponsePagination and implements get_paginated_response. We then include this documentation paginator into settings.SWAGGER_SETTINGS
    $ cat settings.py:
    ...
    SWAGGER_SETTINGS = {
        'DEFAULT_PAGINATOR_INSPECTORS': [
            'djaoapp.docs.DocBalancePagination',
    ...
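The shape of the schema such an inspector has to describe can be sketched with plain dictionaries. This is not drf-yasg code: balance_paginated_schema is a hypothetical helper, and the field types are assumptions inferred from the example response above.

```python
def balance_paginated_schema(item_schema):
    """Wrap an item schema into the paginated response with balance fields."""
    return {
        'type': 'object',
        'properties': {
            # Types are assumptions based on the example response.
            'balance_amount': {
                'type': 'integer',
                'description': 'balance due'},
            'balance_unit': {
                'type': 'string',
                'description': 'unit of the balance due (ex: usd)'},
            'results': {
                'type': 'array',
                'items': item_schema},
        },
        'required': ['results'],
    }


# Abbreviated item schema, for illustration only
transaction_schema = {
    'type': 'object',
    'properties': {
        'created_at': {'type': 'string', 'format': 'date-time'},
        'description': {'type': 'string'},
    },
}
schema = balance_paginated_schema(transaction_schema)
```

In the actual inspector, the same structure would be expressed with the schema objects drf-yasg provides rather than raw dictionaries.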
Write-only parameters
When you have registration and login APIs handing out JWT tokens, you will require write-only parameters.
Unfortunately a design constraint of the OpenAPI spec does not allow DRF write_only fields to be handled easily.
Here we are relying on Django templatetags to prevent parameters with a specific suffix (e.g. password, _key) from being generated in the output section of an API.
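The suffix test itself is simple. A minimal sketch of the logic, independent of the template machinery, could look like the following; WRITE_ONLY_SUFFIXES, is_write_only and output_fields are hypothetical names, and in a Django project the predicate would be exposed to templates as a filter.

```python
# Suffixes treated as write-only (assumed list, taken from the text above)
WRITE_ONLY_SUFFIXES = ('password', '_key')


def is_write_only(field_name, suffixes=WRITE_ONLY_SUFFIXES):
    """True when the field name ends with one of the write-only suffixes."""
    return field_name.endswith(suffixes)


def output_fields(field_names):
    """Field names that may be shown in the output section of an API."""
    return [name for name in field_names if not is_write_only(name)]


visible = output_fields(
    ['username', 'email', 'password', 'new_password', 'api_key'])
```

Matching on a naming convention is blunt, so it only works if write-only fields are named consistently across the code base.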
Converting Docstrings
When we started, the docstrings looked like:
    class AccessibleByListAPIView(ListCreateAPIView):
        """
        ``GET`` lists all relations where an ``Organization`` is accessible by
        a ``User``. Typically the user was granted specific permissions through
        a ``Role``.

        ``POST`` Generates a request to attach a user to a role on an organization

        see :doc:`Flexible Security Framework`.

        **Example request**:

        .. code-block:: http

            GET /api/users/alice/accessibles/

        **Example response**:

        .. code-block:: json

            {
                "count": 1,
                "previous": null,
                "results": [
                    {
                        "created_at": "2018-01-01T00:00:00Z",
                        "slug": "cowork",
                        "printable_name": "ABC Corp.",
                        "role_description": "manager",
                    }
                ]
            }
        """
Because of the way coreapi and drf-yasg extract documentation for API end points, we had to create pass-through post(), etc. methods to attach documentation to the correct HTTP method.
    class AccessibleByListAPIView(ListCreateAPIView):
        """
        Lists all relations where an ``Organization`` is accessible by
        a ``User``. Typically the user was granted specific permissions through
        a ``Role``.

        see :doc:`Flexible Security Framework`.

        **Examples**

        .. code-block:: http

            GET /api/users/alice/accessibles/ HTTP/1.1

        responds

        .. code-block:: json

            {
                "count": 1,
                "previous": null,
                "results": [
                    {
                        "created_at": "2018-01-01T00:00:00Z",
                        "slug": "cowork",
                        "printable_name": "ABC Corp.",
                        "role_description": "manager",
                    }
                ]
            }
        """

        def post(self, request, *args, **kwargs):
            """
            Creates a request to attach a user to a role on an organization
            """
            return super(AccessibleByListAPIView, self).post(
                request, *args, **kwargs)
reStructuredText to HTML
We need a way to go from RST to Markdown to HTML, or to integrate an RST to HTML formatter in the documentation pipeline. A few searches later ("openapi doc restructedtext", "documenting API with openapi and restructuredtext") led to a Sphinx extension to generate API docs from OpenAPI, to recommonmark (a Docutils bridge to CommonMark-py that Sphinx can use), and to a Docutils writer for converting reStructuredText documents to Markdown.
We tried loading the OpenAPI YAML file and sending each description to Pandoc to convert reStructuredText to Markdown.
    $ sudo port install pandoc
    $ pandoc $f -f rst -t markdown -o $filename.md
    $ echo '``CamelCase`` class' | pandoc -f rst -t markdown
It was not viable. References were escaped, examples removed.
The docutils utility
At this point, we decided to look at the code of drf-yasg and create our own APIDocView view with a Jinja2 template that implements a recursive macro.
We derive a NoHeaderHTMLWriter from docutils.writers.html5_polyglot.Writer to convert from RST to HTML.
Extracting examples
By creating our own APIDocView, we are also able to parse and separate a single docstring into the text description with abstract parameters, and concrete examples.
We really want the text description and the code examples to go through separate processing pipelines (rst2html and pygments respectively).
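One way to do that separation is to split on the code-block directives. The following is a minimal sketch, not the actual APIDocView code: split_docstring is a hypothetical helper, and it assumes examples are always introduced by a ".. code-block::" directive followed by an indented block.

```python
import textwrap

DIRECTIVE = '.. code-block::'


def split_docstring(docstring):
    """Return (description, examples); examples are (language, code) pairs."""
    description = []
    examples = []  # list of [language, lines] pairs
    in_example = False
    for line in textwrap.dedent(docstring).splitlines():
        stripped = line.strip()
        if stripped.startswith(DIRECTIVE):
            examples.append([stripped[len(DIRECTIVE):].strip(), []])
            in_example = True
        elif in_example and (not stripped or line[0] in ' \t'):
            # Blank or indented lines belong to the current example.
            examples[-1][1].append(line)
        else:
            in_example = False
            description.append(line)
    text = '\n'.join(description).strip()
    return text, [(language, textwrap.dedent('\n'.join(lines)).strip())
                  for language, lines in examples]


DOCSTRING = """
    Lists all relations where an ``Organization`` is accessible by a ``User``.

    .. code-block:: http

        GET /api/users/alice/accessibles/ HTTP/1.1

    .. code-block:: json

        {"count": 1}
    """

description, examples = split_docstring(DOCSTRING)
```

The description can then be handed to the RST to HTML writer, while each (language, code) pair goes to the syntax highlighter with the matching lexer.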
Highlighting HTTP requests
The HTTP request examples would not be highlighted correctly. This required a look into the Pygments source code to realize that the HTTP version needs to be specified for Pygments to parse the request correctly.
    # Write in the examples:
    GET / HTTP/1.1
    # instead of
    GET /
Remove WIP APIs from documentation
With time running out and some APIs not yet fully designed, we remove those Work-in-Progress APIs from the generated OpenAPI schema by adding the following code to their view class.
swagger_schema = None
Django Translation
Localization will also have an impact on documentation and potentially require some code refactoring.
Following the instructions in Django for translations, we run python manage.py makemessages and run into a few error messages:
    CommandError: Unable to find a locale path to store translations for file ...
    Error: Empty dir
After creating the directory where the locale will be stored, a few updates in the settings are needed:
     MIDDLEWARE
    +    'django.middleware.locale.LocaleMiddleware',
    +LOCALE_PATHS = (os.path.join(BASE_DIR, APP_NAME, 'locale'),)
To test the browser language, we go into Chrome Settings > Language and move "French" to the top of the list. In the Chrome tools, we inspect and check the Accept-Language headers.
makemessages does not respect DEBUG settings
makemessages will go through all gettext statements in the code and templates, irrespective of DEBUG=0 or DEBUG=1.
    if DEBUG:
        ENV_INSTALLED_APPS = (
            'debug_toolbar',
            'django_extensions',
            'drf_yasg')
    else:
        ENV_INSTALLED_APPS = tuple([])

    INSTALLED_APPS = ENV_INSTALLED_APPS + (
    ...
If your Django settings.py looks like the above for example, makemessages will still add messages for debug_toolbar if you happen to have copied its templates locally in templates/debug_toolbar for some reason.
Format strings
Running the makemessages command led to a set of warnings.
    $ cat decorators.py
    messages.error(request, _("%s is not a direct manager of %s.") % (
        request.user, organization))

    $ python manage.py makemessages -l fr
    The translator cannot reorder the arguments.
    Please consider using a format string with named arguments,
    and a mapping instead of a tuple for the arguments.
It makes sense. First the arguments might be re-ordered in translation. A passive voice might make more sense in a language. Using named arguments also helps the translator understand the sentence better out-of-context.
    $ cat decorators.py
    messages.error(request,
        _("%(user)s is not a direct manager of %(organization)s.") % {
            'user': request.user, 'organization': organization})
As a guideline, any message to be translated should use a format string with named arguments. That greatly helps when translating the message out of context.
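The benefit of a mapping is easy to check in plain Python. In the sketch below, the second template is a made-up English rewording that stands in for a translation placing the organization first; the sample values are also made up.

```python
context = {'user': 'alice', 'organization': 'cowork'}

# Original word order
original = "%(user)s is not a direct manager of %(organization)s."

# Made-up rewording standing in for a translation that swaps the order
reordered = "%(organization)s does not have %(user)s as a direct manager."

original_message = original % context
reordered_message = reordered % context
```

Both templates format correctly from the same mapping, which is not possible with positional %s arguments once the order changes.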
Marking gettext strings as safe
It happens that you need to mix gettext and mark_safe. For example, in code like:
    terms_of_use = forms.BooleanField(
        label=mark_safe(_("I agree with terms and conditions") % reverse(
            'legal_terms_of_use')))
Sometimes it works and sometimes, you can be tied into knots trying to do so.
Cleaning up messages
Professional translators usually quote and charge their work by the word. That is one reason to make sure you do not have "password" and "Password" as msgid. Another reason to review messages is to have consistent documentation.
We use the --no-wrap command-line flag to write each msgid on a single line. It is then straightforward to create an alphabetical list of localized messages.
    $ python manage.py makemessages -l pt --symlinks --no-wrap
    $ grep 'msgid "' djaoapp/locale/pt/LC_MESSAGES/*.po | sort | cut -d '"' -f 2 > msgid.log
Just to make sure we understand the amount of work required from translators, we also count the number of unique words.
$ cat msgid.log | xargs -n1 | sort | uniq -c | wc -l
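The same count can be sketched in Python, which also makes it easy to see how much case differences inflate the total. count_unique_words is a hypothetical helper and the sample messages are made up; splitting on whitespace only approximates what xargs does with quoted strings.

```python
def count_unique_words(messages, fold_case=False):
    """Count distinct whitespace-separated words across all messages."""
    words = set()
    for message in messages:
        if fold_case:
            message = message.lower()
        words.update(message.split())
    return len(words)


# Made-up sample of msgid strings
sample = ["Password", "password", "Sign in", "Sign up"]

unique = count_unique_words(sample)
unique_folded = count_unique_words(sample, fold_case=True)
```

On this sample the case-sensitive count is 5 and the case-insensitive count is 4, the difference being the duplicated "password" entry.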
More to read
The slides version of this post can be found here.
In the search for writing good API documentation, we stumbled upon Documenting APIs, a guide for technical writers, and 4 Common Anti-patterns to Avoid in Your Documentation Part 1 and Part 2. All three are really worth a read.
If you are looking for more posts on APIs and Django, Django Rest Framework, AngularJS and permissions, Porting a Django app to Jinja2 templates, and Date/time, back and forth between Javascript, Django and PostgreSQL are worth reading next.
More technical posts are also available on the DjaoDjin blog, as well as business lessons we learned running a SaaS application hosting platform.