Overview of Incident
On Wednesday, August 21, 2024, ClassLink experienced an issue that temporarily limited the ability of some users to log in to LaunchPad. Our infrastructure team promptly addressed this incident to restore standard functionality.
What Happened
At approximately 9:07 AM ET, our monitoring systems detected that the caching cluster responsible for handling authentication requests was experiencing extremely high CPU usage. This led to slowdowns and prevented some customers from logging in to LaunchPad.
Our team quickly identified that traffic in one of our server clusters was not evenly distributed: a single server was handling the majority of the traffic, which created a CPU bottleneck. To address the issue, our infrastructure team provisioned a new server cluster; however, the transition to the new cluster was not completed quickly enough to prevent login slowdowns.
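For readers who want a concrete picture of the imbalance described above, the following is a minimal sketch of how a skewed cluster could be flagged from per-server request counts. It is an illustration only, not ClassLink's monitoring code: the server names, the request counts, and the `tolerance` threshold are all hypothetical.

```python
def traffic_shares(request_counts):
    """Return each server's fraction of the cluster's total requests."""
    total = sum(request_counts.values())
    if total == 0:
        return {server: 0.0 for server in request_counts}
    return {server: count / total for server, count in request_counts.items()}


def find_hot_servers(request_counts, tolerance=2.0):
    """Return servers whose share exceeds `tolerance` times an even share.

    With N servers, an even share is 1/N. The `tolerance` value is an
    assumed threshold chosen for illustration.
    """
    if not request_counts:
        return []
    even_share = 1.0 / len(request_counts)
    shares = traffic_shares(request_counts)
    return sorted(
        server for server, share in shares.items()
        if share > tolerance * even_share
    )


# Hypothetical request counts for a four-server cluster.
sample_counts = {"cache-1": 8200, "cache-2": 600, "cache-3": 650, "cache-4": 550}
hot_servers = find_hot_servers(sample_counts)
print(hot_servers)
```

In this made-up sample, one server carries 82% of the requests against an even share of 25%, so it is the only one flagged.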
Resolution
Services were partially restored by 9:16 AM ET, and login functionality was fully restored by 9:33 AM ET. Further actions are being taken to ensure traffic is distributed evenly across all servers, to prevent similar issues from occurring in the future.
Timeline
- 9:07 AM ET: Initial detection of the issue
- 9:16 AM ET: Partial restoration of services
- 9:33 AM ET: Full restoration of login functionality
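The elapsed time at each milestone can be worked out directly from the timeline above. The short sketch below does that arithmetic; the function name and the shortened event labels are illustrative choices, and only the three timestamps come from this report.

```python
from datetime import datetime

# Timestamps taken from the timeline; labels are shortened for illustration.
TIMELINE = [
    ("9:07 AM", "Initial detection"),
    ("9:16 AM", "Partial restoration"),
    ("9:33 AM", "Full restoration"),
]


def minutes_since_detection(timeline):
    """Map each event label to whole minutes elapsed since the first event."""
    parsed = [(datetime.strptime(clock, "%I:%M %p"), label) for clock, label in timeline]
    start = parsed[0][0]
    return {label: int((moment - start).total_seconds() // 60) for moment, label in parsed}


elapsed = minutes_since_detection(TIMELINE)
print(elapsed)
```

By this calculation, partial restoration came 9 minutes after detection and full restoration 26 minutes after detection.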
Future Technical Plans
While the immediate issue has been resolved, the team is actively working on a permanent solution to ensure even traffic distribution across all servers in the cluster. In addition, we are reviewing procedures and server configuration options so that infrastructure transitions can be made without impacting performance.
We appreciate your understanding and patience as we continuously improve our services. If you have any questions or need further information, please email us at helpdesk@classlink.com.
Thank you for your continued trust in ClassLink.