Handling uploaded objects to S3 with SQS and Lambda

18 Jan 2021

AWS

TL;DR: In this post, I'll describe an often-used pattern for managing objects uploaded to S3. A Lambda function handles the objects uploaded to the bucket. The architecture also uses the built-in notification feature of S3 and SQS.

Amazon S3 is used in almost every AWS-based application, and as such, it’s a very important part of any architecture.

1. Automated notifications

When dealing with S3, it’s often required to apply some logic when users or applications upload objects to the bucket.

Luckily, S3 has an event notification mechanism and it lets us configure different target services when something interesting happens in S3 (object creation or deletion, just to mention two). These targets can be SQS, SNS or Lambda.

2. Architecture

This architecture pattern has a standard SQS queue configured as the target when objects get uploaded to S3 (s3:ObjectCreated:*).
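As a rough illustration of that configuration, here is a minimal Python sketch that builds the notification settings and applies them through a boto3 S3 client. The function names are hypothetical and not taken from the project, which uses a CloudFormation template instead. The sketch also assumes that the queue's access policy already allows S3 to send messages to it.

```python
def build_queue_notification(queue_arn: str) -> dict:
    """Return an S3 notification configuration that targets one SQS queue."""
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                # Fire for every kind of object creation (Put, Post, Copy, ...).
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }


def attach_notification(s3_client, bucket_name: str, queue_arn: str) -> None:
    """Apply the configuration; s3_client is assumed to be a boto3 S3 client."""
    # Note: this call replaces the bucket's existing notification configuration.
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=build_queue_notification(queue_arn),
    )
```

Keeping the configuration in a plain function makes it easy to inspect or test without touching AWS.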

[Figure: architecture for handling objects uploaded to S3 using SQS and Lambda]

As soon as the queue receives a message from S3, the message triggers a Lambda function (creatively named S3HandlerFn). The SQS event object looks like this:

{
  "Records": [
    {
      "messageId": "059f36b4-87a3-44ab-83d2-661975830a7d",
      "receiptHandle": "AQEBwJnKyrHigUMZj6rYigCgxlaS3SLy0a...",
      "body": "Test message.",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1545082649183",
        "SenderId": "AIDAIENQZJOLO23YVJ4VO",
        "ApproximateFirstReceiveTimestamp": "1545082649185"
      },
      "messageAttributes": {},
      "md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:my-queue",
      "awsRegion": "us-east-2"
    },
    {
      "messageId": "2e1424d4-f796-459a-8184-9c92662be6da",
      "receiptHandle": "AQEBzWwaftRI0KuVm4tP+/7q1rGgNqicHq...",
      "body": "Test message.",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1545082650636",
        "SenderId": "AIDAIENQZJOLO23YVJ4VO",
        "ApproximateFirstReceiveTimestamp": "1545082650649"
      },
      "messageAttributes": {},
      "md5OfBody": "e4e68fb7bd0e697a0ae8f1bb342846b3",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-2:123456789012:my-queue",
      "awsRegion": "us-east-2"
    }
  ]
}

A few things to note here:

Lambda can receive multiple records (messages) from SQS in a single invocation, so the function needs to handle batches.
The body property contains the message itself. The attributes and messageAttributes properties are also useful because they can carry metadata to the Lambda function.

In our example, S3 will send a message to the dedicated SQS queue after an object gets uploaded to the bucket. The body property (whose value is a stringified object) of each Record has the reference to S3 (after parsing):

"Records": [
  {
    // ...
    "eventName": "ObjectCreated:Put",
    "s3": {
      // ...
      "bucket": {
        "name": "THE NAME OF THE BUCKET",
        // ...
      },
      "object": {
        "key": "THE KEY OF THE UPLOADED OBJECT",
        // ...
      }
    }
  }
]

We can see that Lambda doesn’t receive the object itself but just a reference to the bucket and the object.

This means that the function needs to extract the bucket name and object key from each record, which arrive in the SQS message, and explicitly download the object from S3 using the GetObject API call. The bucket name and object key become the Bucket and Key parameters of that call, respectively.
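The steps above could look roughly like the following Python sketch. It is not the project's S3HandlerFn code; the function names and the optional s3_client parameter (added so the handler can be exercised without AWS) are assumptions made for illustration. It also accounts for two details of S3 notifications: object keys arrive URL-encoded, and the s3:TestEvent message that S3 sends when the notification is configured has no Records property.

```python
import json
from urllib.parse import unquote_plus


def extract_object_refs(sqs_event: dict) -> list:
    """Return (bucket, key) tuples for every S3 record in an SQS batch."""
    refs = []
    for sqs_record in sqs_event.get("Records", []):
        # The SQS body is a JSON string that contains the S3 notification.
        s3_event = json.loads(sqs_record["body"])
        # The initial s3:TestEvent message has no "Records" key, so it is skipped.
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys are URL-encoded in the event (a space becomes "+").
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            refs.append((bucket, key))
    return refs


def handler(event, context, s3_client=None):
    """Lambda entry point: download each object referenced in the batch."""
    if s3_client is None:
        import boto3  # imported lazily; bundled with the Lambda Python runtime

        s3_client = boto3.client("s3")
    for bucket, key in extract_object_refs(event):
        response = s3_client.get_object(Bucket=bucket, Key=key)
        content = response["Body"].read()
        print(f"Downloaded s3://{bucket}/{key} ({len(content)} bytes)")
```

Separating the parsing from the download keeps the part that depends on the event shape easy to test on its own.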

From here, the Lambda function can do whatever it needs to do with the object (transforming it, uploading it to another bucket, persisting it to a database, etc.).

3. Advantages and disadvantages

This pattern has the following advantages and disadvantages.

3.1. Advantages

Decoupling

When Lambda can’t keep up with the uploads for any reason (due to under-configured Lambda concurrency or downstream error, for example), the messages will remain in the queue so that they can be processed later. The retention period is four days by default, but it is configurable. I have set this value to 1209600 seconds in the template, which is 14 days (this is the maximum configurable value).

This way, no messages will be lost as long as they get processed within the retention period.

Automation

The process is fully automated, and there’s no need for manual intervention (as long as everything goes well, see below).

After the function has successfully processed the message, Lambda will delete it from the queue. It is good news for us because we don’t have to write code for this step in the function.

Re-try and dead-letter queue

If something goes wrong, i.e., if the Lambda function throws an error, the message will get back to the queue, and Lambda will try to process it again.

The RedrivePolicy of the queue determines how many times we want Lambda to try to process the message. If the maxReceiveCount property is set to 2 (just like in the template in this case), Lambda can try processing the message two times; once the receive count exceeds that value, SQS moves the message to the dead-letter queue, which is specified in the same section of the CloudFormation template.
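To make the queue settings concrete, here is a small Python sketch of the attributes for the source queue, combining the retention period and the redrive policy. The function name and parameter defaults are hypothetical; the project itself defines these values in its CloudFormation template. SQS expects attribute values as strings, and RedrivePolicy as a JSON string.

```python
import json


def build_source_queue_attributes(dlq_arn: str,
                                  max_receive_count: int = 2,
                                  retention_days: int = 14) -> dict:
    """Return SQS queue attributes with a retention period and a redrive policy."""
    # 14 days * 24 hours * 60 minutes * 60 seconds = 1209600 seconds.
    retention_seconds = retention_days * 24 * 60 * 60
    return {
        "MessageRetentionPeriod": str(retention_seconds),
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": max_receive_count,
        }),
    }
```

With a boto3 SQS client, the result could be passed as the Attributes argument of create_queue or set_queue_attributes.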

Alarm on the dead-letter queue

A CloudWatch alarm can be set up if the number of messages exceeds a pre-defined threshold in the dead-letter queue.

If messages appear in the dead-letter queue, it will indicate that something went wrong in the source queue.

When such an event occurs, the CloudWatch alarm will publish a notification to an SNS topic, and we can decide how we want to receive it (email, SMS, Slack, etc.). Upon receiving the alert, we can investigate what has happened with the messages.
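As an illustration, such an alarm could be described with the following Python sketch. The function name, the alarm name, the 300-second period and the default threshold of zero are all assumptions, not values from the project's template. The metric used is ApproximateNumberOfMessagesVisible in the AWS/SQS namespace.

```python
def build_dlq_alarm(dlq_name: str, sns_topic_arn: str, threshold: int = 0) -> dict:
    """Return keyword arguments for a CloudWatch alarm on a dead-letter queue."""
    return {
        "AlarmName": f"{dlq_name}-has-messages",  # hypothetical naming scheme
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": dlq_name}],
        "Statistic": "Maximum",
        "Period": 300,  # evaluate the metric over five-minute windows
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        # Alarm as soon as the message count is above the threshold.
        "ComparisonOperator": "GreaterThanThreshold",
        # The SNS topic that receives the alarm notification.
        "AlarmActions": [sns_topic_arn],
    }
```

With a boto3 CloudWatch client, the dictionary could be passed on as put_metric_alarm(**build_dlq_alarm(...)).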

3.2. Disadvantages

Re-drive

Re-trying the messages has some disadvantages as well. When many objects get created in the bucket in a short period, the source queue can build up very quickly if any error occurs in the Lambda. If, for example, there is a formatting error and the function tries to read a property from the uploaded object that doesn’t exist, it will throw an error. Even if SQS resends the message to Lambda in this case, the result won’t (and can’t) be different, and the error will come up again.

A sophisticated error handling practice can decide which errors are worth re-trying (usually system and throttling errors) and which ones are not (for example, validation or formatting errors).
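One way to express this distinction is sketched below in Python. The NonRetryableError class and the process_body function are hypothetical names, and the validation shown is only a placeholder for real checks. Errors that would fail identically on every attempt are logged and swallowed, so the message gets deleted; any other exception propagates, so the batch returns to the queue and is retried.

```python
import json


class NonRetryableError(Exception):
    """Raised for errors that would fail the same way on every attempt."""


def process_body(body: str) -> dict:
    """Placeholder processing step: parse the message body and validate it."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise NonRetryableError(f"body is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict) or "Records" not in payload:
        raise NonRetryableError("body has no 'Records' property")
    return payload


def handler(event, context):
    """Skip messages that can never succeed; let other errors trigger a retry."""
    skipped = []
    for record in event.get("Records", []):
        try:
            process_body(record["body"])
        except NonRetryableError as exc:
            # Retrying cannot help, so log the problem and move on.
            print(f"Skipping message {record['messageId']}: {exc}")
            skipped.append(record["messageId"])
        # Any other exception is not caught here: it fails the invocation,
        # and the messages of the batch become visible in the queue again.
    return {"skipped": skipped}
```

Note that in this sketch an uncaught error sends the whole batch back to the queue, including messages that were already handled, which is one more reason to keep the function idempotent.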

Slow processing time

The processing time of the accumulated messages can be slow. Depending on the number of messages, it can take anywhere from a few seconds to a few hours.

In general, if everything goes well, this pattern works very quickly, usually within a second, but in case of a high volume when the consumer (the Lambda function) can’t cope with the load, the service can slow down.

We can reduce the likelihood of slow queue processing time with the proper configuration of the function (timeout, error handling, concurrency).

At least once delivery

Standard queues have at-least-once delivery, which means that the same message can be delivered more than once. Therefore, the Lambda function needs to be idempotent: processing the same message a second time shouldn’t change the result.
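A minimal sketch of one idempotency approach in Python follows. It derives a key from the bucket name, the object key and the sequencer field of the S3 record, and skips records whose key has been seen before. All names here are hypothetical, and the in-memory store is only a stand-in: its state disappears between cold starts and is not shared between concurrent function instances, so a real setup would need a durable store.

```python
import json


class InMemoryProcessedStore:
    """Stand-in for a durable store of already processed keys."""

    def __init__(self):
        self._seen = set()

    def seen(self, key: str) -> bool:
        return key in self._seen

    def mark(self, key: str) -> None:
        self._seen.add(key)


def idempotency_key(s3_record: dict) -> str:
    """Build a key that identifies one specific upload of one object."""
    bucket = s3_record["s3"]["bucket"]["name"]
    obj = s3_record["s3"]["object"]
    # The sequencer distinguishes successive events for the same object key.
    return f"{bucket}/{obj['key']}/{obj.get('sequencer', '')}"


def process_once(sqs_event: dict, store, process) -> int:
    """Call process() for each S3 record not seen before; return the count."""
    processed = 0
    for sqs_record in sqs_event.get("Records", []):
        for s3_record in json.loads(sqs_record["body"]).get("Records", []):
            key = idempotency_key(s3_record)
            if store.seen(key):
                continue  # duplicate delivery: nothing to do
            process(s3_record)
            # Mark only after success so that a failed attempt can be retried.
            store.mark(key)
            processed += 1
    return processed
```

Delivering the same event twice to process_once with the same store results in the processing function being called only the first time.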

4. Summary

This pattern is an often-used way to handle S3 events and works well in most circumstances.

The code and the CloudFormation template can be downloaded from the GitHub repo of the project.

Thanks for reading and see you next time.