Django, Gunicorn and Syslog-ng
by Sebastien Mirolo on Fri, 29 May 2015As we add more customers, traffic and infrastructure grow, centralized logging becomes more of a pressing need. We configure the Gunicorn/Django service to log messages through the local syslog-ng and have syslog-ng forwards logs to an aggregator machine.
Configuring Gunicorn
Gunicorn 18.0 support for syslog was flaky, Gunicorn 19.0 was broken (see #787). We will use a recent version of Gunicorn (i.e. version 19.3.0) here.
First in Gunicorn Settings, we find a few parameters related to syslog logging:
Name | Description |
---|---|
syslog | True or False. Send Gunicorn logs to syslog. |
syslog_addr | Address to send syslog messages. |
syslog_facility | Syslog facility name |
syslog_prefix | program-name in the syslog entries. |
logger_class | The logger you want to use to log events in Gunicorn. |
logconfig | Standard Python logging module’s Configuration file format. |
Enabling syslog is as simple as adding syslog=True
into the configuration file.
$ diff -u prev gunicorn.conf +syslog=True
Writing syslog events to UDP port
I read through the Gunicorn code base and tried various alternatives:
# first attempt: syslog_addr="/run/systemd/journal/syslog" # second attempt: syslog_addr="unix:///run/systemd/journal/syslog#dgram" # third attempt: syslog_addr="unix:///dev/log#dgram"
None of them were successful to have Gunicorn send messages to syslog. Either it is a problem in my configuration, an issue in Gunicorn or some strange interaction with journald, I haven't found out yet. Configuring syslog-ng to listen on port 514 gets the message across.
Logging Client Address
With Gunicorn >= 19.0 we need to use %({X-Forwarded-For}i)s instead of %(h)s because Gunicorn will set REMOTE_ADDR to "" (see GitHub issue #797). We do just that, otherwise the IP address of the client calling our server is lost.
$ diff -u prev gunicorn.conf -access_log_format='%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"' +access_log_format='%({X-Forwarded-For}i)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
Configuring syslog-ng
Since we are running on Fedora 21, we will need to first install syslog-ng and configure journald to send events to syslog.
$ diff -u prev /etc/systemd/journald.conf -#ForwardToSyslog=no +ForwardToSyslog=yes $ sudo systemctl restart systemd-journald
Once we get journald somewhat out of interfering with our settings, it is time to dig into the syslog-ng quickstart. Syslog-ng is pretty straightforward to configure.
Here we configure syslog-ng to send events over TCP with TLS for two reasons:
- We want to be sure all records arrive to the aggregator.
- We are running in a public cloud environment.
First we will thus create a self-signed certificate stored on the aggregator.
# Create private key and Certificate Signing Request (CSR) $ sudo openssl req -new -newkey rsa:2048 -nodes \ -keyout /etc/pki/tls/private/aggregator.key \ -out /etc/pki/tls/certs/aggregator.csr Common Name (eg, your name or your server's hostname) []: aggregator.internal # Self-signing $ openssl x509 -req -days 365 -in /etc/pki/tls/certs/aggregator.csr \ -signkey /etc/pki/tls/private/aggregator.key \ -out /etc/pki/tls/certs/aggregator.crt
On the Web server
We create two destinations d_gunicorn_network and d_gunicorn_local to log messages locally and forward them to the remote aggregator. This way we are able to check the connection between Gunicorn and syslog-ng and the connection between the local syslog-ng and remote syslog-ng separately.
# First we copy the self-signed public certificate in place. $ sudo cp aggregator.crt /etc/pki/tls/certs $ cd /etc/pki/tls/certs $ openssl x509 -noout -hash -in aggregator.crt hash $ sudo ln -s aggregator.crt hash.0
Depending on how you copied the aggregator.crt file in place, the SELinux attributes might be incorrect. We fix this by issuing the command:
$ sudo restorecon -R -v /etc/pki/tls/certs
We are moving on to configuring syslog-ng.
$ diff -u prev /etc/syslog-ng/syslog-ng.conf source s_sys { system(); internal(); - # udp(ip(0.0.0.0) port(514)); + udp(ip(127.0.0.1) port(514)); }; # Prevent messages to go to both, /var/log/messages and /var/log/gunicorn. filter f_default { level(info..emerg) and not (facility(mail) or facility(authpriv) - or facility(cron)); }; + or facility(cron) + or program(gunicorn.*)); }; $ cat /etc/syslog-ng/conf.d/gunicorn.conf destination d_gunicorn_network { network( "aggregator.internal" port(2222) transport("tls") tls(ca-dir("/etc/pki/tls/certs")) ); }; destination d_gunicorn_local { file("/var/log/gunicorn"); }; filter f_gunicorn { program(gunicorn.*); }; log { source(s_sys); filter(f_gunicorn); destination(d_gunicorn_local); destination(d_gunicorn_network); };
Everything looks fine and we are able to see the events appearing in the local file /var/log/gunicorn. None-the-less, as a sanity check, it is always good to look through the SELinux audit log. Many strange problems discovered later on often stem from misconfigured SELinux permissions.
$ sudo ausearch -ts today -i | grep 'denied.*comm=syslog-ng' | audit2why ... avc: denied { name_connect } dest=2222 ...
So SELinux won't allow syslog-ng to connect to port 2222 on the remote machine (the port remote syslog-ng will listen on). That would definitely be a problem later on.
I tried to create a fine fine grained custom policy as usual:
$ sudo ausearch -ts today -i | grep 'denied.*comm=syslog-ng' | audit2allow -M syslog-ng $ sudo semodule -i syslog-ng.pp
... but that didn't work. I then ran various commands to figure this out:
$ sudo yum install setools-console $ sudo sesearch --allow -s syslogd_t | grep tcp | grep name_connect ... $ sudo semanage port -l | grep syslogd_port_t syslogd_port_t tcp 601 syslogd_port_t udp 514, 601
Everything look good but no luck, name_connect still denied. Finally only setting nis_enabled got the errors to go away and the local syslog-ng to connect to the remote aggregator.
$ sudo setsebool -P nis_enabled 1
On the Aggregator
Getting SELinux to cooperate on the web machine was frustrating but nothing compared to what it would be on the aggregator server.
In less than 5 minutes the modification to syslog-ng were in place. We change keep_hostmane to no here because we want to keep track of which web machine sent the syslog events.
# Configure syslog-ng $ diff -u prev /etc/syslog-ng/syslog-ng.conf - keep_hostname (yes); + keep_hostname (no); +source s_network { + network( + transport("tls") + port(2222) + tls(peer-verify("optional-untrusted") + key-file("/etc/pki/tls/private/aggregator.key") + cert-file("/etc/pki/tls/certs/aggregator.crt")) + ); }; $ cat /etc/syslog-ng/conf.d/gunicorn.conf destination d_gunicorn { file("/var/log/gunicorn"); }; filter f_gunicorn { program(gunicorn.*); }; log { source(s_network); filter(f_gunicorn); destination(d_gunicorn); }; $ sudo touch /var/log/gunicorn $ sudo systemctl restart syslog-ng.service
The syslog-ng.service fails to restart.
$ sudo systemctl status syslog-ng.service start request repeated too quickly for syslog-ng.service
Very helpful. Thanks. The problem is most likely unrelated to anything done "too quickly" and much to do with our old friend SELinux.
$ sudo ausearch -ts today -i | grep 'denied.*comm=syslog-ng' | audit2why
Nothing... I could have swear. Alright, let's try to run from the command line directly.
$ /usr/sbin/syslog-ng -Fevd -p /var/run/syslogd.pid
The syslog-ng daemon works perfectly then. Now that is baffling. It works, no issue from the command line, but not when run through systemd. We encountered similar issues while setting up Jenkins. It was some LXC and SELinux issues then... Let's dig further.
$ diff -u prev /etc/systemd/system/syslog.service -ExecStart=/usr/sbin/syslog-ng -F -p /var/run/syslogd.pid -StandardOutput=null +ExecStart=/usr/sbin/syslog-ng -Fevd -p /var/run/syslogd.pid +#StandardOutput=null $ sudo systemctl daemon-reload $ sudo systemctl start syslog-ng
There is nothing in the log files. Unfortunately we don't have much choice but to start browsing through the journalctl man pages.
$ journalctl _SYSTEMD_UNIT=syslog-ng.service syslog-ng: Error binding socket; addr='AF_INET(0.0.0.0:2222)', error='Permission denied (13)'
So it is a permission issue after all! Why is it not showing in /var/log/audit/audit.log?
$ journalctl _SYSTEMD_UNIT=auditd.service auditd: Audit daemon is low on disk space for logging auditd: Audit daemon is suspending logging due to low disk space.
That's it! We cleared some space but auditd never restarted writing logs by itself.
$ sudo systemctl restart auditd Failed to restart auditd.service: Operation refused, unit auditd.service may be requested by dependency only. $ sudo service auditd resume $ journalctl _SYSTEMD_UNIT=auditd.service auditd: Audit daemon is suspending logging due to low disk space. auditd: Audit daemon is attempting to resume logging. $ sudo systemctl start syslog $ sudo grep 'denied.*comm="syslog-ng"' /var/log/audit/audit.log avc: denied { name_bind } comm="syslog-ng" src=2222 $ sudo ausearch -ts today -i | grep 'denied.*comm=syslog-ng' | audit2allow -M syslog-ng $ sudo semodule -i syslog-ng.pp
Surprisingly, on the aggregator machine, installing a custom SELinux policy worked to let syslog-ng bind to the port it is listening on.
It only remains to make sure ports are open in the aggregator firewall.
$ diff -u prev /etc/sysconfig/iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT +-A INPUT -p tcp -m tcp --dport 2222 -j ACCEPT $ sudo /usr/sbin/ifup-local
Since we are running on AWS, we also need to configure the security groups such that the web tier can forward logs to the aggregator.
Configuring Django
Now that we have Gunicorn sending log records to the remote aggregator through syslog, it is time to configure Django to also send the applications log records to the remote aggregator.
First, let's read the documentation about the SysLogHandler, then do some experiment with the Python interactive shell.
$ python >>> import logging, logging.config >>> logging.config.dictConfig({ 'version': 1, 'handlers': { 'syslog':{ 'level':'DEBUG', 'class':'logging.handlers.SysLogHandler'}}, 'loggers': { '': { 'handlers': ['syslog'], 'level': 'DEBUG', 'propagate': False }}}) >>> logging.info("HELLO") $ tail /var/log/messages May 27 17:51:21 127.0.0.1 HELLO
The documentation indicates that by default, SysLogHandler
will send records as UDP packets to localhost:514, which fortunately
we configured previously syslog-ng to listen on.
There is no magic formatting. Syslog does not automatically add
the processName nor the processID. Python logging
provides a processName variable but it invariably reads
MainProcess
in the syslog output. Thus, we will just
hand craft the program name to match gunicorn.APP_NAME.app.
$ cat django_project/settings.py ... LOGGING = { 'version': 1, 'disable_existing_loggers': False, 'filters': { 'request': { '()': 'django_requestlogging.logging_filters.RequestFilter', }, }, 'formatters': { 'syslog_format': { 'format': 'gunicorn.' + APP_NAME + '.app: [%(process)d] '\ '%(remote_addr)s %(username)s %(levelname)s [%(asctime)s] %(message)s', 'datefmt': '%d/%b/%Y:%H:%M:%S %z' }, }, 'handlers': { 'syslog': { 'level':'DEBUG', 'formatter': 'syslog_format', 'filters': ['request'], 'class':'logging.handlers.SysLogHandler', }, }, 'loggers': { '': { 'handlers': ['syslog'], 'level': 'DEBUG', 'propagate': False } } } ...
Here we are using the django-requestlogging
to prefix all log messages with a remote_addr and username
(when the request.user
is authenticated).
Running further tests shows that remote_addr is consistently 127.0.0.1 which is consistent with the previously discussed Gunicorn changes (GitHub #797).
In the old times putting django.middleware.http.SetRemoteAddrFromForwardedFor
in front of django_requestlogging.middleware.LogSetupMiddleware
in MIDDLEWARE_CLASSES
would have solved the problem.
Unfortunately, this very useful middleware was deprecated in Django 1.1
and subsequently removed. We just need to re-introduce a version of it into
our own project code base.
$ cat django_project/middleware.py class SetRemoteAddrFromForwardedFor(object): """ set REMOTE_ADDR based on HTTP_X_FORWARDED_FOR. """ @staticmethod def process_request(request): if request.META.get('REMOTE_ADDR', '127.0.0.1') == '127.0.0.1': request.META.update({'REMOTE_ADDR': request.META.get('HTTP_X_FORWARDED_FOR', '127.0.0.1')}) return None $ diff -u django_project/settings.py MIDDLEWARE_CLASSES += ( ... + 'djaodjin.middleware.SetRemoteAddrFromForwardedFor', 'django_requestlogging.middleware.LogSetupMiddleware', )
At this point we are able to check the logs are making their way to the aggregator with the correct structure and IP addresses collected along the way. On each line we have in order:
- syslog date
- machine that initiated the log record
- gunicorn.app_name.{access|error|app}
- PID (in brackets)
- Remote client IP
- username (or a minus sign when unavailable)
- log level (or a minus sign when unavailable)
- log message
Logrotate
Finally we need to add our logs to the log rotation schedule on both, the web server and log aggregator.
$ diff -u /etc/logrotate.d/syslog +/var/log/gunicorn
Just one more workaround to insure that logs are rotated correctly, et voila!
$ diff -u /etc/logrotate.d/syslog +missingok
More to read
You might also like to read Multi-tier Implementation in Django or Single sign-on through OpenLDAP next.
More technical posts are also available on the DjaoDjin blog, as well as business lessons we learned running a subscription hosting platform.