Welcome to the DjaoDjin Blog!

A place to share experiences in building Software-as-a-Service.

Django, Gunicorn and Syslog-ng

by Sebastien Mirolo on Fri, 29 May 2015

As we add more customers, traffic and infrastructure grow, centralized logging becomes more of a pressing need. We configure the Gunicorn/Django service to log messages through the local syslog-ng and have syslog-ng forwards logs to an aggregator machine.

Configuring Gunicorn

Gunicorn 18.0 support for syslog was flaky, Gunicorn 19.0 was broken (see #787). We will use a recent version of Gunicorn (i.e. version 19.3.0) here.

First in Gunicorn Settings, we find a few parameters related to syslog logging:

NameDescription
syslogTrue or False. Send Gunicorn logs to syslog.
syslog_addrAddress to send syslog messages.
syslog_facilitySyslog facility name
syslog_prefixprogram-name in the syslog entries.
logger_classThe logger you want to use to log events in Gunicorn.
logconfigStandard Python logging module’s Configuration file format.

Enabling syslog is as simple as adding syslog=True into the configuration file.


$ diff -u prev gunicorn.conf
+syslog=True

Writing syslog events to UDP port

I read through the Gunicorn code base and tried various alternatives:


# first attempt:
syslog_addr="/run/systemd/journal/syslog"
# second attempt:
syslog_addr="unix:///run/systemd/journal/syslog#dgram"
# third attempt:
syslog_addr="unix:///dev/log#dgram"

None of them were successful to have Gunicorn send messages to syslog. Either it is a problem in my configuration, an issue in Gunicorn or some strange interaction with journald, I haven't found out yet. Configuring syslog-ng to listen on port 514 gets the message across.

Logging Client Address

With Gunicorn >= 19.0 we need to use %({X-Forwarded-For}i)s instead of %(h)s because Gunicorn will set REMOTE_ADDR to "" (see GitHub issue #797). We do just that, otherwise the IP address of the client calling our server is lost.


$ diff -u prev gunicorn.conf
-access_log_format='%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
+access_log_format='%({X-Forwarded-For}i)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'

Configuring syslog-ng

Since we are running on Fedora 21, we will need to first install syslog-ng and configure journald to send events to syslog.


$ diff -u prev /etc/systemd/journald.conf
-#ForwardToSyslog=no
+ForwardToSyslog=yes
$ sudo systemctl restart systemd-journald

Once we get journald somewhat out of interfering with our settings, it is time to dig into the syslog-ng quickstart. Syslog-ng is pretty straightforward to configure.

Here we configure syslog-ng to send events over TCP with TLS for two reasons:

  • We want to be sure all records arrive to the aggregator.
  • We are running in a public cloud environment.
Since we have a very dynamic web tier, we don't require the client to be verified (optional-untrusted). We though authorize access to the aggregator through Security Groups and firewall rules on the aggregator itself.

First we will thus create a self-signed certificate stored on the aggregator.


# Create private key and Certificate Signing Request (CSR)
$ sudo openssl req -new -newkey rsa:2048 -nodes \
    -keyout /etc/pki/tls/private/aggregator.key \
    -out /etc/pki/tls/certs/aggregator.csr
Common Name (eg, your name or your server's hostname) []: aggregator.internal

# Self-signing
$ openssl x509 -req -days 365 -in /etc/pki/tls/certs/aggregator.csr \
    -signkey /etc/pki/tls/private/aggregator.key \
    -out /etc/pki/tls/certs/aggregator.crt

On the Web server

We create two destinations d_gunicorn_network and d_gunicorn_local to log messages locally and forward them to the remote aggregator. This way we are able to check the connection between Gunicorn and syslog-ng and the connection between the local syslog-ng and remote syslog-ng separately.


# First we copy the self-signed public certificate in place.
$ sudo cp aggregator.crt /etc/pki/tls/certs
$ cd /etc/pki/tls/certs
$ openssl x509 -noout -hash -in aggregator.crt
hash
$ sudo ln -s aggregator.crt hash.0

Depending on how you copied the aggregator.crt file in place, the SELinux attributes might be incorrect. We fix this by issuing the command:


$ sudo restorecon -R -v /etc/pki/tls/certs

We are moving on to configuring syslog-ng.


$ diff -u prev /etc/syslog-ng/syslog-ng.conf
source s_sys {
    system();
    internal();
-    # udp(ip(0.0.0.0) port(514));
+    udp(ip(127.0.0.1) port(514));
};

# Prevent messages to go to both, /var/log/messages and /var/log/gunicorn.
filter f_default    { level(info..emerg) and
                        not (facility(mail)
                        or facility(authpriv)
-                       or facility(cron)); };
+                       or facility(cron)
+                       or program(gunicorn.*)); };

$ cat /etc/syslog-ng/conf.d/gunicorn.conf
destination d_gunicorn_network {
    network(
        "aggregator.internal"
        port(2222)
        transport("tls")
        tls(ca-dir("/etc/pki/tls/certs"))
    );
};
destination d_gunicorn_local { file("/var/log/gunicorn"); };
filter f_gunicorn { program(gunicorn.*); };
log {
    source(s_sys); filter(f_gunicorn);
    destination(d_gunicorn_local); destination(d_gunicorn_network);
};

Everything looks fine and we are able to see the events appearing in the local file /var/log/gunicorn. None-the-less, as a sanity check, it is always good to look through the SELinux audit log. Many strange problems discovered later on often stem from misconfigured SELinux permissions.


$ sudo ausearch -ts today -i | grep 'denied.*comm=syslog-ng' | audit2why
... avc:  denied  { name_connect } dest=2222 ...

So SELinux won't allow syslog-ng to connect to port 2222 on the remote machine (the port remote syslog-ng will listen on). That would definitely be a problem later on.

I tried to create a fine fine grained custom policy as usual:


$ sudo ausearch -ts today -i | grep 'denied.*comm=syslog-ng' | audit2allow -M syslog-ng
$ sudo semodule -i syslog-ng.pp

... but that didn't work. I then ran various commands to figure this out:


$ sudo yum install setools-console
$ sudo sesearch --allow -s syslogd_t | grep tcp | grep name_connect
...
$ sudo semanage port -l | grep syslogd_port_t
syslogd_port_t                 tcp      601
syslogd_port_t                 udp      514, 601

Everything look good but no luck, name_connect still denied. Finally only setting nis_enabled got the errors to go away and the local syslog-ng to connect to the remote aggregator.


$ sudo setsebool -P nis_enabled 1

On the Aggregator

Getting SELinux to cooperate on the web machine was frustrating but nothing compared to what it would be on the aggregator server.

In less than 5 minutes the modification to syslog-ng were in place. We change keep_hostmane to no here because we want to keep track of which web machine sent the syslog events.


# Configure syslog-ng
$ diff -u prev /etc/syslog-ng/syslog-ng.conf
-    keep_hostname (yes);
+    keep_hostname (no);

+source s_network {
+    network(
+        transport("tls")
+        port(2222)
+        tls(peer-verify("optional-untrusted")
+            key-file("/etc/pki/tls/private/aggregator.key")
+            cert-file("/etc/pki/tls/certs/aggregator.crt"))
+    );
};

$ cat /etc/syslog-ng/conf.d/gunicorn.conf
destination d_gunicorn { file("/var/log/gunicorn"); };
filter f_gunicorn { program(gunicorn.*); };
log { source(s_network); filter(f_gunicorn); destination(d_gunicorn); };

$ sudo touch /var/log/gunicorn
$ sudo systemctl restart syslog-ng.service

The syslog-ng.service fails to restart.


$ sudo systemctl status syslog-ng.service
start request repeated too quickly for syslog-ng.service

Very helpful. Thanks. The problem is most likely unrelated to anything done "too quickly" and much to do with our old friend SELinux.


$ sudo ausearch -ts today -i | grep 'denied.*comm=syslog-ng' | audit2why

Nothing... I could have swear. Alright, let's try to run from the command line directly.


/usr/sbin/syslog-ng -Fevd -p /var/run/syslogd.pid

The syslog-ng daemon works perfectly then. Now that is baffling. It works, no issue from the command line, but not when run through systemd. We encountered similar issues while setting up Jenkins. It was some LXC and SELinux issues then... Let's dig further.


$ diff -u prev /etc/systemd/system/syslog.service
-ExecStart=/usr/sbin/syslog-ng -F -p /var/run/syslogd.pid
-StandardOutput=null
+ExecStart=/usr/sbin/syslog-ng -Fevd -p /var/run/syslogd.pid
+#StandardOutput=null

$ sudo systemctl daemon-reload
$ sudo systemctl start syslog-ng

There is nothing in the log files. Unfortunately we don't have much choice but to start browsing through the journalctl man pages.


$ journalctl _SYSTEMD_UNIT=syslog-ng.service
syslog-ng: Error binding socket; addr='AF_INET(0.0.0.0:2222)', error='Permission denied (13)'

So it is a permission issue after all! Why is it not showing in /var/log/audit/audit.log?


$ journalctl _SYSTEMD_UNIT=auditd.service
auditd: Audit daemon is low on disk space for logging
auditd: Audit daemon is suspending logging due to low disk space.

That's it! We cleared some space but auditd never restarted writing logs by itself.


$ sudo systemctl restart auditd
Failed to restart auditd.service: Operation refused, unit auditd.service may be requested by dependency only.
$ sudo service auditd resume

$ journalctl _SYSTEMD_UNIT=auditd.service
auditd: Audit daemon is suspending logging due to low disk space.
auditd: Audit daemon is attempting to resume logging.

$ sudo systemctl start syslog
$ sudo grep 'denied.*comm="syslog-ng"' /var/log/audit/audit.log
avc:  denied  { name_bind } comm="syslog-ng" src=2222
$ sudo ausearch -ts today -i | grep 'denied.*comm=syslog-ng' | audit2allow -M syslog-ng
$ sudo semodule -i syslog-ng.pp

Surprisingly, on the aggregator machine, installing a custom SELinux policy worked to let syslog-ng bind to the port it is listening on.

It only remains to make sure ports are open in the aggregator firewall.


$ diff -u prev /etc/sysconfig/iptables
 -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
+-A INPUT -p tcp -m tcp --dport 2222 -j ACCEPT

$ sudo /usr/sbin/ifup-local

Since we are running on AWS, we also need to configure the security groups such that the web tier can forward logs to the aggregator.

Configuring Django

Now that we have Gunicorn sending log records to the remote aggregator through syslog, it is time to configure Django to also send the applications log records to the remote aggregator.

First, let's read the documentation about the SysLogHandler, then do some experiment with the Python interactive shell.


$ python
>>> import logging, logging.config
>>> logging.config.dictConfig({
    'version': 1,
    'handlers': {
        'syslog':{
            'level':'DEBUG',
            'class':'logging.handlers.SysLogHandler'}},
    'loggers': {
        '': {
            'handlers': ['syslog'],
            'level': 'DEBUG',
            'propagate': False
    }}})
>>> logging.info("HELLO")

$ tail /var/log/messages
May 27 17:51:21 127.0.0.1 HELLO

The documentation indicates that by default, SysLogHandler will send records as UDP packets to localhost:514, which fortunately we configured previously syslog-ng to listen on.

There is no magic formatting. Syslog does not automatically add the processName nor the processID. Python logging provides a processName variable but it invariably reads MainProcess in the syslog output. Thus, we will just hand craft the program name to match gunicorn.APP_NAME.app.


$ cat django_project/settings.py
...
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'request': {
            '()': 'django_requestlogging.logging_filters.RequestFilter',
        },
    },
     'formatters': {
        'syslog_format': {
            'format':
'gunicorn.' + APP_NAME + '.app: [%(process)d] '\
'%(remote_addr)s %(username)s %(levelname)s [%(asctime)s] %(message)s',
            'datefmt': '%d/%b/%Y:%H:%M:%S %z'
        },
    },
    'handlers': {
        'syslog': {
            'level':'DEBUG',
            'formatter': 'syslog_format',
            'filters': ['request'],
            'class':'logging.handlers.SysLogHandler',
        },
    },
    'loggers': {
        '': {
            'handlers': ['syslog'],
            'level': 'DEBUG',
            'propagate': False
        }
    }
}
...

Here we are using the django-requestlogging to prefix all log messages with a remote_addr and username (when the request.user is authenticated).

Running further tests shows that remote_addr is consistently 127.0.0.1 which is consistent with the previously discussed Gunicorn changes (GitHub #797).

In the old times putting django.middleware.http.SetRemoteAddrFromForwardedFor in front of django_requestlogging.middleware.LogSetupMiddleware in MIDDLEWARE_CLASSES would have solved the problem. Unfortunately, this very useful middleware was deprecated in Django 1.1 and subsequently removed. We just need to re-introduce a version of it into our own project code base.


$ cat django_project/middleware.py

class SetRemoteAddrFromForwardedFor(object):
    """
    set REMOTE_ADDR based on HTTP_X_FORWARDED_FOR.
    """

    @staticmethod
    def process_request(request):
        if request.META.get('REMOTE_ADDR', '127.0.0.1') == '127.0.0.1':
            request.META.update({'REMOTE_ADDR':
                request.META.get('HTTP_X_FORWARDED_FOR', '127.0.0.1')})
        return None

$ diff -u django_project/settings.py
MIDDLEWARE_CLASSES += (
...
+    'djaodjin.middleware.SetRemoteAddrFromForwardedFor',
     'django_requestlogging.middleware.LogSetupMiddleware',
)

At this point we are able to check the logs are making their way to the aggregator with the correct structure and IP addresses collected along the way. On each line we have in order:

  • syslog date
  • machine that initiated the log record
  • gunicorn.app_name.{access|error|app}
  • PID (in brackets)
  • Remote client IP
  • username (or a minus sign when unavailable)
  • log level (or a minus sign when unavailable)
  • log message

Logrotate

Finally we need to add our logs to the log rotation schedule on both, the web server and log aggregator.


diff -u /etc/logrotate.d/syslog
+/var/log/gunicorn

Just one more workaround to insure that logs are rotated correctly, et voila!


diff -u /etc/logrotate.d/syslog
+missingok

More to read

You might also like to read Multi-tier Implementation in Django or Single sign-on through OpenLDAP next.

More technical posts are also available on the DjaoDjin blog, as well as business lessons we learned running a subscription hosting platform.