Postmortem for July 18, 2023 incident: Sigma outage for users served through AWS CloudFront in US-EAST-1

Incident Start Time: July 18, 2023 16:37 UTC

Incident End Time: July 18, 2023 17:13 UTC

Summary

For approximately 36 minutes (16:37 to 17:13 UTC), users located in the Eastern US were unable to access the Sigma application. Attempts to log in or open workbooks returned an HTTP 421 error page with the message “The request could not be satisfied” and details about a mismatched certificate for the HTTPS connection.

Timeline

Timestamp (UTC) Event/Response
2023-07-18 16:47 Support receives first reports of customers getting 421 mismatched-certificate errors when trying to access their Sigma app; West Coast support is unable to reproduce the issue
2023-07-18 17:00 Incident escalated to engineering
2023-07-18 17:06 Engineering concludes the issue is likely with Cloudflare or CloudFront
2023-07-18 17:12 Cloudflare indicates it is an AWS/CloudFront issue
2023-07-18 17:15 Customers indicate issue is resolved for them
2023-07-18 17:26 AWS reports that “elevated error rates for request serviced by the CloudFront Origin Shield and Regional Edge Cache in the US-EAST-1 region” occurred between 16:37 UTC and 17:13 UTC

Root Cause

The AWS CloudFront edge cache (CDN) was non-functional in the US-EAST-1 region for approximately 36 minutes, from 16:37 UTC to 17:13 UTC. AWS has provided no further details about the nature of the failure, citing only “elevated error rates.” The failure occurred between customers’ browsers and the AWS edge cache, before any Sigma services were involved.

Scope of Impact

Sigma users hosted on AWS in the US-EAST-1 region were unable to load workbooks for approximately 36 minutes. Scheduled exports and API functionality were not impacted.

Forward-looking Preventative Measures

  • Add a monitor for CloudFront errors to alert support and engineering if this condition occurs
  • Change support communication policy to update the status page as soon as Sigma becomes aware of regional impacts such as this
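
As an illustration of the first measure, the sketch below shows one way the alerting decision for such a monitor could be expressed. It is not Sigma's actual monitoring implementation. The names ProbeResult, is_edge_failure, and regions_to_alert, the region labels, and the threshold and min_samples defaults are all hypothetical, and the sketch assumes that synthetic probes run from several regions and record the HTTP status code each one receives.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ProbeResult:
    region: str  # hypothetical label for where the synthetic probe ran
    status: int  # HTTP status code the probe received


def is_edge_failure(status: int) -> bool:
    # 421 is the error customers saw in this incident; treating 5xx as a
    # failure too is an assumption made for this sketch.
    return status == 421 or 500 <= status <= 599


def regions_to_alert(
    results: Iterable[ProbeResult],
    threshold: float = 0.5,  # assumed: alert when at least half of probes fail
    min_samples: int = 3,    # assumed: ignore regions with too few probes
) -> List[str]:
    """Return the regions whose probe failure rate meets the threshold."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for result in results:
        totals[result.region] += 1
        if is_edge_failure(result.status):
            failures[result.region] += 1
    return sorted(
        region
        for region, count in totals.items()
        if count >= min_samples and failures[region] / count >= threshold
    )


# Example shaped like this incident: East probes fail with 421, West probes succeed.
samples = [ProbeResult("us-east-1", 421)] * 4 + [ProbeResult("us-west-2", 200)] * 4
print(regions_to_alert(samples))  # ['us-east-1']
```

Evaluating failures per region rather than globally matters for an incident like this one, where West Coast support could not reproduce an error that was affecting Eastern US users.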