Customer Portal Elevated Error Rates
Incident Report for Yext
Postmortem

Summary

On August 23, 2019, from 11:07 a.m. to 11:51 a.m. US Eastern Time, the Customer Portal and the Platform API experienced elevated error rates, particularly in the Knowledge Graph area.

Root Cause

A routine change to a service introduced a subtle resource leak. Over many hours, the leak exhausted that resource on the machine hosting the service, which degraded the other services running on the same machine.

Remediation

We will add monitoring and alerting at both the service and machine levels for the class of resource that was exhausted during this incident.
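
As an illustration only, the kind of two-level check described above could be sketched as follows. The report does not name the exhausted resource or the monitoring stack, so every name here (ResourceSample, classify, pending_alerts, the "service" and "machine" scope labels) and both threshold ratios are hypothetical assumptions, not Yext's actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSample:
    """One usage reading for a bounded resource (hypothetical shape)."""
    scope: str   # assumed labels: "service" or "machine"
    name: str    # which service or machine the reading belongs to
    used: int    # units currently in use
    limit: int   # maximum units available


def classify(sample, warn_ratio=0.80, crit_ratio=0.95):
    """Return "ok", "warning", or "critical" based on used/limit.

    The default ratios are illustrative assumptions, not values from the report.
    """
    if sample.limit <= 0:
        raise ValueError("limit must be positive")
    ratio = sample.used / sample.limit
    if ratio >= crit_ratio:
        return "critical"
    if ratio >= warn_ratio:
        return "warning"
    return "ok"


def pending_alerts(samples):
    """Return (scope, name, level) for every sample that is not "ok"."""
    results = []
    for sample in samples:
        level = classify(sample)
        if level != "ok":
            results.append((sample.scope, sample.name, level))
    return results
```

Checking the same resource at both scopes matters for a slow leak like this one: a single service can look healthy against its own limit while the machine as a whole is close to exhaustion, and the reverse is also possible.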

Posted Aug 30, 2019 - 17:53 EDT

Resolved
This incident has been resolved.
Posted Aug 23, 2019 - 14:57 EDT
Monitoring
A fix has been implemented and service has been restored. We will continue monitoring for issues.
Posted Aug 23, 2019 - 11:56 EDT
Investigating
We are currently investigating elevated error rates in the Customer Portal. We will update as soon as we have more information.
Posted Aug 23, 2019 - 11:34 EDT
This incident affected: Customer Portal.