Triggering a script on uploads to AWS S3
by Sebastien Mirolo on Thu, 23 Dec 2021
The main reason to implement a fast path to upload logs is to be able to process them more quickly than through the daily batch process. Fortunately, AWS S3 can generate events into an SQS queue and/or trigger a Lambda function when a new file is created in an S3 bucket.
Creating the SQS queue
Even though S3 events can trigger a Lambda function (which we might ultimately use), it is easier to send and receive messages through an SQS queue while verifying the setup is correct (we don't have to deal with issues in the Lambda function itself). First we create the queue.
$ aws sqs create-queue --queue-name *example*
{
    "QueueUrl": "*queue_url*"
}
$ aws sqs get-queue-attributes --queue-url *queue_url* --attribute-names QueueArn
{
    "Attributes": {
        "QueueArn": "*queue_arn*"
    }
}
Then we restrict which service can send messages to the queue.
{ "Version": "2012-10-17", "Id": "logNotification", "Statement": [{ "Sid": "logNotificationSID", "Effect": "Allow", "Principal": {"Service":"s3.amazonaws.com"}, "Action":[ "SQS:SendMessage" ], "Resource": "*queue_arn*", "Condition": { "ArnLike":{ "aws:SourceArn": "arn:aws:s3:*:*:*bucket*" }, "StringEquals":{ "aws:SourceAccount": "*account_id*" } } }] }
To set the policy from the command line, I haven't found any way other than converting the policy document to a string value for the Policy attribute. Hence the following:
$ cat set-queue-attributes.json
{"Policy": "{\"Version\": \"2012-10-17\",\"Id\":\"logNotification\",\"Statement\":[{\"Sid\": \"logNotificationSID\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"s3.amazonaws.com\"},\"Action\":[\"SQS:SendMessage\"],\"Resource\":\"*queue_arn*\",\"Condition\":{\"ArnLike\":{\"aws:SourceArn\": \"arn:aws:s3:*:*:*bucket*\"},\"StringEquals\":{\"aws:SourceAccount\":\"*account_id*\"}}}]}"}
$ aws sqs set-queue-attributes --queue-url *queue_url* --attributes file://set-queue-attributes.json
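As a side note, rather than hand-escaping the JSON, jq can generate set-queue-attributes.json from a readable policy file. A minimal sketch, assuming the policy shown earlier is saved as queue-policy.json (a filename chosen here for illustration):

# Compact the readable policy to one line, then wrap it as a JSON
# string value for the Policy attribute (jq handles the escaping).
$ jq -n --arg policy "$(jq -c . queue-policy.json)" '{Policy: $policy}' > set-queue-attributes.json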
Updating the S3 configuration
Now that the queue is created and S3 is allowed to send notifications to it, we update the S3 configuration to do just that.
$ cat notification.json
{
    "QueueConfigurations": [{
        "Id": "logNotificationS3",
        "QueueArn": "*queue_arn*",
        "Events": [
            "s3:ObjectCreated:*"
        ]
    }]
}
$ aws s3api put-bucket-notification-configuration --bucket *bucket* --notification-configuration file://notification.json
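As a sanity check, we can read the configuration back; the output should echo the QueueConfigurations we just set.

$ aws s3api get-bucket-notification-configuration --bucket *bucket*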
Testing
We first purge the queue in case other processes are uploading files to the S3 bucket. That gives us a better chance of reading the event we are looking for during this testing phase.
$ aws sqs purge-queue --queue-url *queue_url*
We then copy a file to the bucket and read the message queue.
$ aws s3 cp notification.json s3://*bucket*
$ aws sqs receive-message --queue-url *queue_url* | jq .
{
    "Records": [
        {
            "eventVersion": "2.1",
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            ...
            "s3": {
                "s3SchemaVersion": "1.0",
                "configurationId": "logNotificationS3",
                "bucket": {
                    ...
                    "arn": "arn:aws:s3:::*bucket*"
                },
                "object": {
                    "key": "notification.json",
                    ...
                }
            }
        }
    ]
}
Et voila!
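From here, to actually trigger a script on each upload, a small worker can long-poll the queue and dispatch each event. Here is a minimal sketch, assuming the standard receive-message output where the S3 event arrives as a JSON string in the message Body; process-log.sh is a hypothetical placeholder for your own processing script:

#!/bin/bash
# Minimal SQS worker: long-poll the queue and run a script on each new object.
# `process-log.sh` is a hypothetical placeholder, not part of the setup above.
while true; do
    message=$(aws sqs receive-message --queue-url *queue_url* --wait-time-seconds 20)
    # No output means no message arrived during the polling window.
    [ -z "$message" ] && continue
    # The S3 event is a JSON string inside the message Body.
    bucket=$(echo "$message" | jq -r '.Messages[0].Body | fromjson | .Records[0].s3.bucket.name')
    key=$(echo "$message" | jq -r '.Messages[0].Body | fromjson | .Records[0].s3.object.key')
    ./process-log.sh "$bucket" "$key"
    # Delete the message so the same upload is not processed twice.
    receipt=$(echo "$message" | jq -r '.Messages[0].ReceiptHandle')
    aws sqs delete-message --queue-url *queue_url* --receipt-handle "$receipt"
done

Long polling (--wait-time-seconds 20) keeps the loop from hammering the SQS API when the queue is empty.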
More to read
You might also like to read:
- Logrotate, S3 storage, and AWStats
- Fast-tracking server errors to a log aggregator on S3
- Logging gunicorn messages through journald to syslog-ng
- Debugging logrotate scripts
More technical posts are also available on the DjaoDjin blog, as well as business lessons we learned running a SaaS application hosting platform.