LiveApiV2 Disruption
Incident Report for Yext
Postmortem

Summary

On May 3, 2018, beginning at 8:54 PM EDT, the Live API experienced increasing levels of errors, causing many requests to be served slowly or fail. Many searches on Pages Locators, which are powered by Live API, returned slowly or served no results. In response to automated alerts, we mitigated the issue at 9:18 PM EDT.

The issue was isolated to the Live API; the Platform API was not affected. Pages for individual locations and static location directory pages on Yext Pages sites were also unaffected.

Root Cause

The errors were caused by a failure in a third-party search infrastructure provider used by the Live API. Our mitigation was to fail over to a backup cluster hosted in an unaffected region of the search infrastructure provider. We returned to normal operations using our primary cluster at around 12:20 AM EDT on May 4.

We plan to tighten our alerting thresholds and automate the failover procedure for this failure mode to reduce the time to recovery from future similar issues.

Posted May 10, 2018 - 18:26 EDT

Resolved
This incident has been resolved.
Posted May 04, 2018 - 00:25 EDT
Monitoring
We have failed over successfully and service should be fully restored. We will continue to monitor for any issues.
Posted May 03, 2018 - 21:30 EDT
Identified
We have identified issues in Live API with a search infrastructure provider which may be causing store locator outage. We have failed over to a backup provider to restore service and we will post updates as soon as possible.
Posted May 03, 2018 - 21:23 EDT
This incident affected: Content (Content API).