On March 26, from 11:41 AM to 11:50 AM ET, the Knowledge Graph experienced elevated error rates and page load failures.
Earlier in the day, an automated failover had taken place in response to hardware failures in our network system. When we restored the failed component, the system attempted to route traffic through it before it was ready, resulting in a spike of connection errors. Once the restored component booted up successfully, error rates returned to normal.
We have modified our process to ensure that the system does not route traffic through network components before they are ready.
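
For readers curious what a safeguard of this kind can look like, here is a minimal sketch of a readiness gate. It is an illustration only, not our production code: the `Router` and `Component` classes, the `required_ok` threshold, and the component names are all hypothetical. The idea is that a restored component receives traffic only after it has passed several health checks in a row.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    consecutive_ok: int = 0  # health checks passed in a row


class Router:
    """Hypothetical router that withholds traffic until a component is ready."""

    def __init__(self, required_ok=3):
        # Assumed threshold: number of consecutive passing checks required.
        self.required_ok = required_ok
        self.components = {}  # name -> Component

    def add(self, name):
        # A newly added or restored component starts with zero passing checks.
        self.components[name] = Component(name)

    def record_health_check(self, name, ok):
        # A failed check resets the streak, so a flapping component stays out.
        component = self.components[name]
        component.consecutive_ok = component.consecutive_ok + 1 if ok else 0

    def is_ready(self, name):
        return self.components[name].consecutive_ok >= self.required_ok

    def eligible(self):
        # Only components that have proven themselves ready receive traffic.
        return [name for name in self.components if self.is_ready(name)]
```

In this sketch, a component that has just been restored is registered with `add` but does not appear in `eligible()` until enough calls to `record_health_check` have succeeded, so traffic keeps flowing through the failover path in the meantime.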