Pages Serving Elevated Error Rates
Incident Report for Yext
Postmortem

Summary

On the morning of October 10, 2019 (US Eastern time), Yext engineers began tracking rising error rates in our Pages Serving traffic. Users who encountered errors were shown an error page from our CDN provider. We immediately escalated the issue to our CDN provider. At 5:11 PM ET, the provider deployed a fix in its systems, and we observed error rates return to normal. The errors affected less than 1% of all Pages traffic.

Root Cause

A change made by our CDN provider caused a small percentage of connection attempts to Yext servers to time out.

Remediation

A review of our telemetry showed that this class of errors first appeared, at much lower rates, in the two days leading up to this incident. Because those error rates were within our service objectives, they did not trigger our standard suite of alerts. We plan to reduce noise in our error-rate metrics and increase the sensitivity of our monitoring so that we receive earlier warning of future incidents.
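
As an illustration only, a more sensitive check of the kind described above could combine the existing absolute threshold with a comparison against a recent baseline, so that a sustained rise is flagged even while the error rate is still within the service objective. The Python sketch below is a minimal example under stated assumptions: the function name, the 1% objective, the 3x ratio, and the baseline floor are all hypothetical and do not describe Yext's actual monitoring configuration.

```python
from statistics import median


def should_alert(current_rate, baseline_rates,
                 slo_threshold=0.01, ratio=3.0, floor=0.0001):
    """Return True if the current error rate warrants an alert.

    current_rate   -- error rate for the latest window (0.0 to 1.0)
    baseline_rates -- error rates from recent comparable windows
    slo_threshold  -- hypothetical absolute service objective
    ratio          -- hypothetical multiple of baseline that counts as a rise
    floor          -- minimum baseline, so a near-zero baseline is not noisy
    """
    # Absolute check: the rate has reached the service objective.
    if current_rate >= slo_threshold:
        return True

    # Relative check: the rate is well above its recent baseline,
    # even though it is still within the service objective.
    baseline = max(median(baseline_rates), floor)
    return current_rate >= ratio * baseline


if __name__ == "__main__":
    recent = [0.0002, 0.0003, 0.0002]
    print(should_alert(0.002, recent))   # within objective, far above baseline
    print(should_alert(0.0003, recent))  # normal fluctuation
```

Using the median of recent windows rather than the mean keeps a single noisy window from shifting the baseline, which is one simple way to trade off noise against sensitivity.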

Posted Oct 25, 2019 - 12:41 EDT

Resolved
This incident has been resolved.
Posted Oct 11, 2019 - 06:29 EDT
Monitoring
Error rates have returned to normal in Pages Serving. We will continue to monitor for additional issues.
Posted Oct 10, 2019 - 17:39 EDT
Update
We are continuing to investigate this issue.
Posted Oct 10, 2019 - 17:37 EDT
Update
We are continuing to work with our CDN provider to identify the root cause. We will update as soon as we have more information.
Posted Oct 10, 2019 - 16:25 EDT
Update
We are continuing to work with our CDN provider to identify the root cause. We will update as soon as we have more information.
Posted Oct 10, 2019 - 15:03 EDT
Update
We are continuing to work with our CDN provider to identify the root cause. We will update as soon as we have more information.
Posted Oct 10, 2019 - 13:58 EDT
Investigating
We are investigating elevated error rates, attributed to our CDN provider, affecting less than 1% of our Pages Serving traffic. We will update as soon as we have more information.
Posted Oct 10, 2019 - 12:58 EDT
This incident affected: Pages Serving.