On Friday, March 22nd, beginning at 11:56 PM EST, portions of the Platform API began returning errors. We began investigating immediately and identified the cause by 12:30 AM EST on March 23rd. Mitigations were in place by 1:00 AM EST, at which point the errors subsided.
Only the Analytics and Knowledge Manager APIs were affected. Other API services continued to operate normally.
A bug in our deployment infrastructure prevented the backend servers for these two services from starting up after a routine shutdown. In addition to fixing the bug, we plan to add monitoring and alerting to help us anticipate this failure mode.