Our team recently implemented an internal static website that allows employees to download technical reports. Since we’re heavy AWS (Amazon Web Services) users, we naturally decided to host it on AWS S3, which provides a dedicated feature to build static websites (S3 static website hosting).

Very quickly, however, we ran into an issue: AWS S3 does not provide any native, out-of-the-box authentication/authorization process. Because our website was going to be internal-only, we needed some kind of authorization mechanism to prevent non-authorized users from accessing our website and reports.

We needed to find a solution to secure our internal static website on AWS S3.

Discovering the solution with Amazon CloudFront and Lambda@Edge

We use Okta for all identity and user management, so whatever solution we found had to integrate with Okta. Okta has several authentication/authorization flows, all of which require the application to perform a back-end check, such as verifying that the response/token returned by Okta is legitimate.

So we needed a way to carry out these checks on a static website whose back end we don't control. That's when we learned about AWS Lambda@Edge, which lets you run Lambda functions at different stages of a request to, and a response from, Amazon CloudFront:

[Diagram: CloudFront events that trigger Lambda functions]

As the diagram indicates, we can trigger a Lambda Function at four different stages:

  • When the request enters Amazon CloudFront (viewer-request)
  • When the request goes out to the origin (origin-request)
  • When the response is returned from the origin (origin-response)
  • When the response is returned from Amazon CloudFront (viewer-response)

We saw a solution to our original issue: trigger a Lambda at the viewer-request stage that would check if the user is authorized.

There are two possible outcomes:

  1. If the user is authorized, let the request continue so that CloudFront returns the restricted content
  2. If the user is not authorized, send an HTTP response that redirects them to a login page

[Diagram: CloudFront Lambda authentication/authorization check]

Implementing the Lambda@Edge function

Here we cover the key elements and the main issues we faced. The complete code is available at https://github.com/GuiTeK/aws-s3-oauth2-okta. Feel free to use it in your project!

Lambda@Edge restrictions and caveats

As we developed the solution, we ran into several restrictions and caveats of Lambda@Edge.

1 – Environment variables

Lambda@Edge functions cannot use environment variables, so we needed another way to pass configuration data to our function. We opted for SSM parameters, with the parameter names templated into the Node.js code (we use Terraform to render the template when deploying the Lambda function).
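The sketch below shows one way such a configuration loader could look. The parameter names and the `loadConfig` helper are hypothetical, and the fetching function is injected so that the example stays self-contained; in a deployed function it would presumably wrap the AWS SDK's SSM `getParameter` call, as hinted in the comment.

```javascript
'use strict';

// Hypothetical parameter names. In a templated source file these strings
// would be filled in by Terraform at deploy time instead of being hard-coded.
const PARAMETER_NAMES = {
  clientId: '/static-site/okta/client-id',
  clientSecret: '/static-site/okta/client-secret',
};

// Fetch every named parameter and return an object with the same keys.
// `fetchParameter` is an async function (name) => value. With the AWS SDK v2
// it could be written as (assumption, not taken from the original code):
//   (name) => new AWS.SSM({ region: 'us-east-1' })
//     .getParameter({ Name: name, WithDecryption: true })
//     .promise()
//     .then((res) => res.Parameter.Value)
async function loadConfig(fetchParameter, names) {
  const keys = Object.keys(names);
  const values = await Promise.all(keys.map((key) => fetchParameter(names[key])));
  const config = {};
  keys.forEach((key, index) => {
    config[key] = values[index];
  });
  return config;
}
```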

2 – Lambda package size limit

For viewer events (reminder: we use the viewer-request event), the Lambda package can be 1 MB at most. One MB is pretty small considering that it includes all dependencies (except of course the runtime/standard library) of your Lambda Function.

That’s why we had to rewrite our Lambda in Node.js instead of the original Python, because the Python package with its API and other dependencies exceeded the 1 MB limit.

3 – Lambda region

Lambda@Edge functions can only be created in the us-east-1 region. It's not a big issue, but it means you'll need to:

  • Provision your AWS resources in that region to make things easier
  • Configure a separate AWS provider in Terraform to access the bucket you want to protect if it's not in us-east-1

4 – Lambda role permission

The IAM execution role associated with the Lambda@Edge functions must allow the service principal edgelambda.amazonaws.com in addition to the usual lambda.amazonaws.com. See AWS – Setting IAM permissions and roles for Lambda@Edge.
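For reference, the trust policy of such a role would look roughly like the following. It is written here as a JavaScript object and printed as JSON for consistency with the other examples; in our setup the equivalent document lives in Terraform, so treat this as an illustration rather than the deployed configuration.

```javascript
'use strict';

// Trust (assume-role) policy allowing both the regular Lambda service and
// the Lambda@Edge service to assume the execution role.
const assumeRolePolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: {
        Service: ['lambda.amazonaws.com', 'edgelambda.amazonaws.com'],
      },
      Action: 'sts:AssumeRole',
    },
  ],
};

// Print the JSON document that would be attached to the IAM role.
console.log(JSON.stringify(assumeRolePolicy, null, 2));
```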

Authorization mechanism with Okta

Once we had dealt with the above restrictions and caveats, we focused on the authentication/authorization.

Okta offers several ways to authenticate and authorize users. We decided to go with OAuth2, the industry-standard protocol for authorization.

Note: Okta implements the OpenID Connect (OIDC) standard, which adds a thin authentication layer on top of OAuth2 (that’s the purpose of the ID token mentioned hereafter). Our solution would also work with pure OAuth2 with minimal modifications (removal of the ID token use in the code).

OAuth2 itself offers several authorization flows depending on the kind of application using it. In our case, we needed the Authorization Code flow.

Here is the complete diagram of the Authorization Code flow taken from developer.okta.com that shows how it works:

[Diagram: OAuth2 Authorization Code grant flow]

To summarize the flow:

  • Our Lambda Function redirects the user to Okta, where they are prompted to log in
  • Okta redirects the user to our website/Lambda Function with a code
  • Our Lambda Function checks if the code is legitimate and exchanges it for access and ID tokens by sending a request to Okta
  • Depending on the result returned by Okta, we:
    • Allow or deny access to the restricted content
    • If access is allowed, save the access and ID tokens in a cookie to avoid having to re-authorize the user on every page
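The first and third steps of this flow can be sketched as two small helpers: one builds the URL the user is redirected to, the other describes the request that exchanges the code for tokens. Everything in `config` is a made-up placeholder, the helper names are hypothetical, and the endpoint paths assume an Okta authorization server whose issuer ends in `/oauth2/default`; check your own Okta settings before reusing them.

```javascript
'use strict';

// Placeholder values for illustration only.
const config = {
  issuer: 'https://example.okta.com/oauth2/default',
  clientId: 'my-client-id',
  clientSecret: 'my-client-secret',
  redirectUri: 'https://reports.example.com/callback',
};

// Step 1: URL to which an unauthenticated user is redirected.
// `state` should be an unguessable value that is checked again on return.
function buildAuthorizeUrl(cfg, state) {
  const params = new URLSearchParams({
    client_id: cfg.clientId,
    response_type: 'code',
    scope: 'openid',
    redirect_uri: cfg.redirectUri,
    state,
  });
  return `${cfg.issuer}/v1/authorize?${params}`;
}

// Step 3: description of the POST request that exchanges the code for tokens.
// The caller sends it with whatever HTTP client is available.
function buildTokenRequest(cfg, code) {
  const basic = Buffer.from(`${cfg.clientId}:${cfg.clientSecret}`).toString('base64');
  return {
    url: `${cfg.issuer}/v1/token`,
    method: 'POST',
    headers: {
      authorization: `Basic ${basic}`,
      'content-type': 'application/x-www-form-urlencoded',
      accept: 'application/json',
    },
    body: new URLSearchParams({
      grant_type: 'authorization_code',
      code,
      redirect_uri: cfg.redirectUri,
    }).toString(),
  };
}
```

If Okta rejects the exchange, the code was not legitimate and access is denied; otherwise the JSON response carries the access and ID tokens.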

Using JSON Web Tokens to store authorization result

So far we have a working authorization process; however, we need to check the access/ID tokens on every request (a malicious user could forge a cookie or tokens). Checking the tokens means sending a request to Okta and waiting for the response on every page the user visits, which adds significant latency to every CloudFront response and page load, and is clearly sub-optimal.

Note: While local verification of the Okta token is theoretically possible, as of this writing the SDK provided by Okta uses an LRU (in-memory) cache when fetching the keys used to check the tokens. Because we're using AWS Lambda, and the memory/state of the program isn't kept between invocations, the SDK is useless to us: it would still send one HTTP request to Okta for every user request, to retrieve the JWKs (JSON Web Keys). Worse, there's a limit of 10 JWK requests per minute, which would make our solution stop working if there were more than 10 requests per minute.

To resolve this, we decided to use JSON Web Tokens, as we did for our admin application. The initial authorization process is the same except that, instead of saving the access/ID tokens into a cookie, we create a JWT containing these tokens, and then save the JWT into a cookie.
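Storing and reading back that cookie could look like the sketch below. The cookie name, its attributes and both helper names are assumptions made for this example. In a Lambda@Edge response the resulting string would go under the `set-cookie` header, and incoming cookies arrive as a list of `{ key, value }` entries in `request.headers.cookie`.

```javascript
'use strict';

// Hypothetical cookie name.
const COOKIE_NAME = 'session';

// Build the Set-Cookie header value that stores the JWT in the browser.
function buildSetCookie(jwt, maxAgeSeconds) {
  return `${COOKIE_NAME}=${jwt}; Path=/; Max-Age=${maxAgeSeconds}; Secure; HttpOnly; SameSite=Lax`;
}

// Extract the JWT from a CloudFront request object, or return null if absent.
function readSessionCookie(request) {
  const cookieHeaders = request.headers.cookie || [];
  for (const { value } of cookieHeaders) {
    for (const pair of value.split(';')) {
      const separator = pair.indexOf('=');
      if (separator === -1) continue;
      if (pair.slice(0, separator).trim() === COOKIE_NAME) {
        return pair.slice(separator + 1).trim();
      }
    }
  }
  return null;
}
```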

Since the JWT is cryptographically signed:

  • A malicious actor cannot forge one (they would need the private key used to sign them)
  • The checking step required on every request is fast: we traded a long, I/O-expensive HTTP request for a quick cryptographic check.
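To make the signing and checking steps concrete, here is a minimal sketch that uses only Node's built-in `crypto` module. It assumes an RSA key pair and the RS256 algorithm, which is one common choice and not necessarily what our code uses, and the `signJwt` and `verifyJwt` helpers are hypothetical. In a real project a maintained JWT library is usually preferable to hand-rolled code like this.

```javascript
'use strict';

const crypto = require('crypto');

// Encode a JSON-serializable value as base64url.
const encode = (value) => Buffer.from(JSON.stringify(value)).toString('base64url');

// Create a signed token containing `payload`, valid for `ttlSeconds`.
function signJwt(payload, privateKey, ttlSeconds, now = Math.floor(Date.now() / 1000)) {
  const header = { alg: 'RS256', typ: 'JWT' };
  const body = { ...payload, iat: now, exp: now + ttlSeconds };
  const signingInput = `${encode(header)}.${encode(body)}`;
  const signature = crypto
    .sign('sha256', Buffer.from(signingInput), privateKey)
    .toString('base64url');
  return `${signingInput}.${signature}`;
}

// Return the payload if the token is well-formed, correctly signed and not
// expired; return null otherwise. No network call is involved.
function verifyJwt(token, publicKey, now = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const [headerPart, payloadPart, signaturePart] = parts;
  let header;
  let payload;
  try {
    header = JSON.parse(Buffer.from(headerPart, 'base64url').toString('utf8'));
    payload = JSON.parse(Buffer.from(payloadPart, 'base64url').toString('utf8'));
  } catch (err) {
    return null;
  }
  // Only accept the algorithm we sign with.
  if (!header || header.alg !== 'RS256') return null;
  const signatureOk = crypto.verify(
    'sha256',
    Buffer.from(`${headerPart}.${payloadPart}`),
    publicKey,
    Buffer.from(signaturePart, 'base64url')
  );
  if (!signatureOk) return null;
  if (!payload || typeof payload.exp !== 'number' || payload.exp <= now) return null;
  return payload;
}
```

The `exp` claim is what gives the token its limited lifetime: once it has passed, `verifyJwt` returns null and the user goes through the Okta flow again.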

Note on JWT expiration and renewal

The JWT has a relatively short pre-defined expiration time to avoid having a valid JWT containing expired or revoked access/ID tokens. Another option would be to check the access/ID tokens regularly and revoke the associated JWT if needed, but then we would need a revocation mechanism, which makes things more complex.

Finally, as suggested above, the tokens provided by Okta have an expiration time. It is possible to renew them transparently using a refresh token (so the user doesn't have to log in again when the tokens expire), but we didn't implement that.

Conclusion

While adding OAuth2 authentication to a static website hosted on S3 with Okta (or any other OAuth2 provider) is possible in an AWS-integrated and secure manner, it's certainly not straightforward.

It requires writing a middleware between AWS and the OAuth2 provider (Okta in our case) using Lambda@Edge. We had to do the following ourselves:

  1. Validate the user authentication
  2. Remember the user authentication
  3. Refresh the user authentication (not implemented in our solution)
  4. Revoke the user authentication (TTL is implemented, but revocation before the end of the TTL is not)

Finally, a bunch of AWS resources must be created to glue everything together and make it work.

All this was worth the effort, because it works and our website is now more secure.

You can find the code of the Lambda@Edge as well as the infrastructure (Terraform) here: https://github.com/GuiTeK/aws-s3-oauth2-okta.

About the author
Guillaume Truchot

Site Reliability Engineer
