How To: Configure AWS Lambda to Automatically Invalidate CloudFront Objects

By Aaron F. April 7th, 2019

Introduction

One common issue when taking advantage of the Amazon cloud to optimize your website is the lack of the necessary cache-control headers. If you run your website through Google PageSpeed Insights, one of the most common suggestions website administrators will see is "Leverage Browser Caching." This is challenging when storing static assets in AWS S3 simply because there is no native way to apply the appropriate headers to all objects. Even when using the AWS CLI from the command line (which requires an in-depth knowledge of the CLI), the process can still be quite tedious.
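
For instance, a one-off pass with the CLI looks something like the following (the bucket name and header value here are placeholders), and it has to be re-run by hand every time objects are added or changed:

aws s3 cp s3://YOUR-BUCKET/ s3://YOUR-BUCKET/ --recursive \
    --metadata-directive REPLACE --cache-control "max-age=31536000"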

Enter AWS Lambda.

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume - there is no charge when your code is not running. With Lambda, you can run code for virtually any type of application or backend service - all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.

In today's tutorial I will show you how to properly configure AWS Lambda & S3 to automatically add the appropriate cache-control headers to any object that is uploaded, created, or changed in your S3 bucket. These headers encourage browser caching, which will reduce your bandwidth costs & speed up loading times for your visitors. When configured correctly, these headers will even carry over to AWS CloudFront (CDN).

Pros & Cons

Pros:
  • Applies cache control headers any time a file is uploaded or modified, automatically.
  • Allows granular control over which buckets, folders, and file types are targeted.
  • Bills only for the resources used. For average website assets (images, etc.), I've found an average runtime of less than 500 milliseconds. (See Lambda pricing)
  • Completely automated - set it & forget it!
Cons:
  • Does not apply headers to objects that already exist in your bucket (a backfill sketch for existing objects follows this list).
  • The script actually runs twice per upload: the first invocation checks whether the object already has the specified headers and, if not, applies them with a copy/put operation. That copy triggers the function a second time, but since the object now has the specified header, the second run halts immediately.
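
To cover objects that were already in the bucket before the function existed, you can run a one-off backfill script. Below is a minimal sketch in Node.js, assuming the aws-sdk v2 module and locally configured AWS credentials; the bucket name is a placeholder, and the header check mirrors the Lambda function so already-correct objects are skipped:

'use strict';

let aws = require('aws-sdk');
let s3 = new aws.S3({ apiVersion: '2006-03-01' });

var Bucket = 'YOUR-BUCKET'; // placeholder - your bucket name
var CacheControlHeader = 'max-age=31536000';

function backfill(ContinuationToken) {
    // List the bucket one page (up to 1000 keys) at a time
    s3.listObjectsV2({ Bucket: Bucket, ContinuationToken: ContinuationToken }, (err, list) => {
        if (err) { return console.log(err); }
        list.Contents.forEach((obj) => {
            s3.headObject({ Bucket: Bucket, Key: obj.Key }, (err, head) => {
                if (err) { return console.log(err); }
                if (head.CacheControl == CacheControlHeader) { return; } // already set
                // Copy the object over itself, replacing its metadata
                s3.copyObject({
                    Bucket: Bucket,
                    Key: obj.Key,
                    CopySource: encodeURIComponent(Bucket + '/' + obj.Key),
                    ContentType: head.ContentType,
                    CacheControl: CacheControlHeader,
                    MetadataDirective: 'REPLACE'
                }, (err) => {
                    if (err) { console.log(err); }
                    else { console.log('Updated: ' + obj.Key); }
                });
            });
        });
        // Keep paging until the whole bucket has been covered
        if (list.IsTruncated) { backfill(list.NextContinuationToken); }
    });
}

backfill();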

Getting Started

I am assuming you already have an AWS account set up. If not, head over to Amazon AWS and create an account. Once you are done and verified, go ahead and log into your account.

Navigate to Lambda from the Services menu (it appears under the Compute section).

If you do not have any existing Lambda functions, click the blue Get Started button, or Create a Lambda Function if you already have other functions configured.

Next, you will need to pick a blueprint to use for your function. Search for s3-get-object, and select it.

The next page, Configure Event Source, is where we configure what kind of event will trigger our Lambda function to run. I went with the following (a sample of the event this sends to the function is shown after the list):

  • Event source type: S3
  • Bucket: YOUR-BUCKET
  • Event type: Object Created (All)
  • Prefix: Leave blank for entire bucket, or add a folder within your bucket to target (more folders can be added later).
  • Suffix: Leave blank for all file types, or add a specific file type to target (more types can be added later).
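
For reference, here is roughly the shape of the notification S3 delivers to the function, trimmed down to the fields this function actually reads (the bucket name and key are placeholders). You can paste something like this into the Lambda console's test event to exercise the function by hand:

{
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": { "name": "YOUR-BUCKET" },
                "object": { "key": "images/example.jpg" }
            }
        }
    ]
}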

The next page is where we configure the actual function and add the code to be run when a trigger is received (which we configured in the previous step). Enter the following:

  • Name: Enter a name for your function.
  • Description: Enter a description for your function, or leave it as the default, or blank.
  • Runtime: Node.js 4.3
  • Code entry type: Edit code inline

Copy & paste the following code into the code box provided. Be sure to overwrite all code that is populated by default:

'use strict';

// CONFIGURATION //////////////////////////////////////////////
var CacheControlHeader = 'max-age=31536000';
///////////////////////////////////////////////////////////////

let aws = require('aws-sdk');
let s3 = new aws.S3({ apiVersion: '2006-03-01' });

exports.handler = (event, context, callback) => {
    // Pull the bucket name and object key out of the S3 event record
    const bucket = event.Records[0].s3.bucket.name;
    const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));
    var params = {
        Bucket: bucket,
        Key: key
    };

    s3.getObject(params, (err, data) => {
        var message;
        if (err) {
            console.log(err);
            message = 'Error: Failed to get object: s3://' + bucket + '/' + key +
                '. Make sure it is in the same region as this function!';
            console.log(message);
        } else {
            const mimeHeader = data.ContentType;
            // Only rewrite the object if the header is missing or different;
            // this check is what stops the function from re-triggering itself forever.
            if (data.CacheControl != CacheControlHeader) {
                var copyParams = {
                    Bucket: bucket,
                    Key: key,
                    CopySource: encodeURIComponent(bucket + '/' + key),
                    ContentType: data.ContentType,
                    CacheControl: CacheControlHeader,
                    Metadata: {},
                    MetadataDirective: 'REPLACE'
                };
                // Copy the object over itself, replacing its metadata
                s3.copyObject(copyParams, (err, data) => {
                    if (err) {
                        console.log(err);
                        message = 'Error: Failed to copy object: s3://' + bucket + '/' + key +
                            '. Make sure it is in the same region as this function!';
                        console.log(message);
                    } else {
                        message = 'Metadata updated successfully! OBJECT: s3://' + bucket + '/' + key +
                            ' CONTENT-TYPE: ' + mimeHeader + ' CACHE-CONTROL: ' + CacheControlHeader;
                        console.log(message);
                    }
                });
            } else {
                message = 'Metadata already updated! OBJECT: s3://' + bucket + '/' + key +
                    ' CONTENT-TYPE: ' + mimeHeader + ' CACHE-CONTROL: ' + CacheControlHeader;
                console.log(message);
            }
        }
    });
};

Feel free to modify the CONFIGURATION section to suit your specific needs. The CacheControlHeader variable is where you specify the cache-control header value to be applied to all objects.
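
A few common values, as a rough guide (choose based on how often your assets change):

var CacheControlHeader = 'max-age=31536000';          // 1 year - long-lived static assets
// var CacheControlHeader = 'public, max-age=86400';  // 1 day - assets that change occasionally
// var CacheControlHeader = 'no-cache';               // always revalidate with the server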

After the code box, there are a few more options to configure. Here is what I went with:

  • Handler: index.handler
  • Role: S3 execution role - this will open a new window in AWS IAM, where you will need to configure a role that can access your S3 bucket (a sketch of the permissions involved is shown after this list). Use the following:
    • Role Description: Lambda execution role permissions
    • IAM Role: lambda_s3_exec_role
    • Policy Name: Create a new Role Policy
    • Click the blue Allow button in the bottom right. The window will close, and the Role option will now be populated with the role we just created.
  • Memory (MB): 128
  • Timeout: 0 min, 5 sec
  • VPC: No VPC
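
For reference, the policy attached to the role should end up allowing roughly the following (a sketch, with a placeholder bucket name). Note that because the function rewrites objects with a copy operation, it needs s3:PutObject in addition to the s3:GetObject permission the blueprint's role grants:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::YOUR-BUCKET/*"
        }
    ]
}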

The final page is our review page. To enable your function immediately, you will want to ensure you select the Enable event source checkbox. Click the blue Create Function button. You should be taken to your newly created Lambda function, on the Event sources tab.

Additional Configuration (Optional)

As mentioned above, when setting up your function, you must create a single event source (trigger) that will cause your Lambda function to run. Now that we have created our function, we are able to configure more granular triggers for it. This is useful when there are individual folders or file types in your S3 bucket that you specifically want to target when running this function. Unfortunately, there is no way to exclude based on certain criteria, only a way to include based on given criteria (though you can bail out in code, as sketched below).
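
If you do need to skip certain keys, one hypothetical workaround is an early exit at the top of the handler, right after key is decoded (the prefixes here are placeholders):

// Hypothetical early exit: skip keys under prefixes you never want to touch,
// since S3 event filters can only include, not exclude.
var excluded = ['logs/', 'tmp/'];
if (excluded.some((prefix) => key.indexOf(prefix) === 0)) {
    console.log('Skipped: ' + key);
    return callback(null);
}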

To set additional event sources (triggers) for your function, navigate to Lambda, then click on the function we just created. From there, navigate to the Event sources tab. Just as we did when we first configured this function, you can now add additional event sources that will trigger this code to run. You can target individual folders & file extensions in one S3 bucket, or in multiple buckets. You can re-use the same Role we created above for each additional event source.

For larger files, you may need to adjust the Timeout option.

Conclusions

I hope you find this function useful. In fact, I am currently using it on this very website, and a number of client websites. If you enjoyed this blog post, please consider sharing & following me on your favourite social media site!

Topics: Amazon Web Services, Cloud Infrastructure, Platform Engineering, Web Design & Development