On May 3, 2018, beginning at 8:54 PM EDT, the Live API experienced an increasing rate of errors, causing many requests to be served slowly or to fail. Many searches on Pages Locators, which are powered by the Live API, returned slowly or returned no results. In response to automated alerts, we mitigated the issue at 9:18 PM EDT, 24 minutes after it began.
The issue was isolated to the Live API; the Platform API was not affected. Pages for individual locations and static location directory pages on Yext Pages sites were also unaffected.
The errors were caused by a failure in a third-party search infrastructure provider used by the Live API. To mitigate, we failed over to a backup cluster hosted in a region of that provider that was not affected. We returned to normal operations on our primary cluster at approximately 12:20 AM EDT on May 4.
We plan to tighten our alerting thresholds and automate the failover procedure for this failure mode to reduce the time to recovery from similar issues in the future.
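
As a rough illustration of what automated failover for this kind of failure mode could look like, the following is a minimal sketch and not a description of our actual implementation. The class name `FailoverRouter`, the cluster labels, and the window, threshold, and minimum-sample values are all hypothetical. The sketch tracks recent request outcomes in a sliding window and switches to the backup cluster once the error rate crosses a threshold.

```python
from collections import deque


class FailoverRouter:
    """Hypothetical sketch: route to a primary cluster, fail over on a high error rate."""

    def __init__(self, primary, backup, window=100, threshold=0.5, min_samples=20):
        self.primary = primary
        self.backup = backup
        self.active = primary
        self.threshold = threshold      # assumed error-rate trigger (illustrative value)
        self.min_samples = min_samples  # avoid reacting to a handful of requests
        self.results = deque(maxlen=window)  # True = success, False = error

    def record(self, success):
        """Record one request outcome and fail over if the primary looks unhealthy."""
        self.results.append(success)
        if self.active == self.primary and self._error_rate_exceeded():
            self.active = self.backup
            self.results.clear()  # start a fresh window for the backup cluster

    def _error_rate_exceeded(self):
        if len(self.results) < self.min_samples:
            return False
        errors = sum(1 for ok in self.results if not ok)
        return errors / len(self.results) >= self.threshold

    def restore_primary(self):
        """Return to the primary cluster; left as an explicit, operator-driven step."""
        self.active = self.primary
        self.results.clear()
```

In this sketch the return to the primary cluster is deliberately manual, mirroring the incident above, where traffic moved back to the primary only after the provider had recovered. A tighter alerting threshold corresponds to lowering `threshold` or `min_samples`, at the cost of more sensitivity to brief error spikes.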