Issue with Live API and Locators Consumer Serving
Incident Report for Yext
Postmortem

Summary

On June 23rd, 2019 at 10:58 a.m. ET, Yext engineering was alerted to increased error rates in the Live API platform. Engineers began investigating and saw that our third-party provider was experiencing issues. Yext systems automatically failed over to a backup provider by 11:08 a.m. ET, but the service continued to experience increased latency and error rates. Mitigation efforts continued, and error rates returned to normal by 11:19 a.m. ET.

Root Cause

The Live API platform automatically fails over to a backup provider in the event of an outage at our primary provider. The initial alert at 10:58 a.m. ET was caused by an incident at our primary provider. Our systems automatically failed over to the backup provider, but the backup was underprovisioned and therefore unable to serve production load.

Remediation

We plan to add improved monitoring and processes to ensure that our backup systems are able to serve production load during provider outages.
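
As an illustration only, the kind of check such monitoring might perform can be sketched as below. This is a minimal sketch under stated assumptions, not a description of Yext's actual tooling: the names `ProviderCapacity` and `backup_can_serve`, the requests-per-second metric, and the 20% headroom default are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProviderCapacity:
    """Hypothetical record of what a provider is provisioned to handle."""
    name: str
    provisioned_rps: float  # requests per second the provider can serve


def backup_can_serve(backup: ProviderCapacity,
                     peak_production_rps: float,
                     headroom: float = 0.2) -> bool:
    """Return True if the backup covers peak production load plus headroom.

    `headroom` is a fractional safety margin (0.2 means 20% above peak);
    the default is an assumed value chosen for illustration.
    """
    required_rps = peak_production_rps * (1.0 + headroom)
    return backup.provisioned_rps >= required_rps
```

A periodic job could evaluate this check against recent peak traffic and raise an alert when it returns False, so that an underprovisioned backup is noticed before a failover rather than during one.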

Posted Jul 08, 2019 - 14:33 EDT

Resolved
This incident has been resolved.
Posted Jun 23, 2019 - 19:23 EDT
Update
We continue to see normal performance. The incident remains open at our provider, so we are continuing to monitor.
Posted Jun 23, 2019 - 15:23 EDT
Monitoring
The system is back to normal operation, and we are continuing to monitor.
Posted Jun 23, 2019 - 11:23 EDT
Identified
We have identified an issue with an underlying data store that is causing some requests to respond slowly or fail. We are putting mitigations into effect.
Posted Jun 23, 2019 - 11:15 EDT
This incident affected: Live API.