On Tuesday, March 14th, 2023, starting at 4:32 AM ET, the Customer Portal experienced intermittent unavailability and degraded performance. Yext engineers began investigating at 8:16 AM ET. Most of the issues were resolved by 1:50 PM ET, although some persisted until 3:36 PM ET.
Increased load on the system left backends unable to acquire the network resources they needed, which caused customer requests to fail. In addition, a minor misconfiguration introduced during recent infrastructure upgrades compounded the issue for particular pages within the portal.
The team increased the size of the resource pool, which stopped the errors. Going forward, the default pool size will be increased for all related services, additional monitoring and alerting will be added, and usage of the underlying resources will be reduced to minimize the chance of contention.
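
To illustrate the failure mode and the planned monitoring in general terms, the following is a minimal sketch of a fixed-size resource pool that fails a request when no slot becomes free in time and exposes a utilization figure that an alert could watch. It is not Yext's implementation: the report does not name the resource involved, and `BoundedPool`, `PoolExhausted`, `alert_ratio`, and the timeout values are all hypothetical names and numbers chosen for the example.

```python
import threading
from contextlib import contextmanager


class PoolExhausted(Exception):
    """Raised when no slot frees up within the timeout (hypothetical name)."""


class BoundedPool:
    """Illustrative fixed-size pool; stands in for any limited resource."""

    def __init__(self, size, alert_ratio=0.8):
        self.size = size
        self.alert_ratio = alert_ratio  # assumed alerting threshold
        self._slots = threading.Semaphore(size)
        self._lock = threading.Lock()
        self._in_use = 0

    def utilization(self):
        """Fraction of slots currently held, from 0.0 to 1.0."""
        with self._lock:
            return self._in_use / self.size

    def should_alert(self):
        """True once utilization reaches the alerting threshold."""
        return self.utilization() >= self.alert_ratio

    @contextmanager
    def acquire(self, timeout=1.0):
        """Hold one slot for the duration of the with-block."""
        if not self._slots.acquire(timeout=timeout):
            # Under heavy load every slot is busy and the request fails.
            raise PoolExhausted(f"no slot free after {timeout}s")
        with self._lock:
            self._in_use += 1
        try:
            yield
        finally:
            with self._lock:
                self._in_use -= 1
            self._slots.release()
```

In this sketch, raising `size` lowers the chance that a request times out waiting for a slot, while `should_alert()` gives a signal that the pool is close to saturation before requests start failing.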