Intermittently between 11/25 and 11/27, customers saw scan failures in both the API and the Scan tool. On 11/25, errors were seen from approximately 6:30PM ET until 7:50PM ET. On 11/26, errors were seen during three short periods totaling approximately 45 minutes. On 11/27, errors were seen from approximately 3:15AM ET until 6:00AM ET and again from approximately 4:00PM ET until 4:50PM ET.
A network file share system experienced issues that required us to fail over to a backup system. The failover began on 11/25 at approximately 5:00PM ET.
Between the evening of 11/25 and the afternoon of 11/27, the backup system experienced a series of issues that caused the server to become unresponsive and crash.
The degraded state of the backup system led us to fail back to the primary system on the afternoon of 11/27, starting at approximately 4:00PM ET.
To prevent similar issues from occurring again, we will remove the scanning system's existing dependency on the shared file system.
Additionally, we are adding telemetry to notify us more quickly if the scanning system experiences similar problems in the future.