Kallidus Status

We share details of service availability and performance for Kallidus products.

The service may be slow or degraded.


Major Incident for Classic LMS, Learn and Perform

» Published on Thu, 04 Jul 2019 07:22:00 +0000

  • Post-Mortem

    Between 08:00 and 12:51 BST on 04/07/2019 we identified an issue accessing all Kallidus Perform, Learn and Classic LMS services. The issue was caused by an underlying problem with our hosting provider, Microsoft Azure. Azure experienced an outage within its storage infrastructure across data centre regions, and the outage was not limited to Kallidus. This caused various services that we provide to be unavailable. Under these circumstances, we were reliant on Microsoft to restore its service.

    Microsoft Azure has provided a preliminary root cause, which is subject to change as they continue to investigate. If there are any major changes we will send out an update:

    Azure - Summary of impact: Between 06:00 UTC and 16:25 UTC on 04 July 2019, a subset of customers leveraging Storage in UK South may have experienced service availability issues. In addition, resources with dependencies on Storage may also have experienced downstream impact in the form of availability issues.

    Azure - Preliminary root cause: Engineers identified high levels of resource utilization on a single storage scale unit. As a result, services dependent on the storage scale unit experienced a high number of failures and latency which manifested in availability issues.

    Azure - Mitigation: Engineers manually applied load balancing configuration changes to bring the affected storage scale unit back to a healthy state. As a consequence, resource utilization levels were brought back to normal, mitigating the issue.

    Azure - Next steps: Engineers will continue to investigate to establish the full root cause and prevent future occurrences.

    » Updated Fri, 05 Jul 2019 16:23:00 +0000
  • Resolved

    Normal service has now resumed. We are continuing to monitor and discuss with our hosting partner.

    Once again thank you for your patience during this incident, and please accept our apologies for the inconvenience this will have caused you.

    Once we have an RCA from our hosting partner, we will share it with all customers.

    » Updated Thu, 04 Jul 2019 12:48:00 +0000
  • Update

    Normal service is now resuming for the majority of our customers. 

    Our hosting partner is continuing to monitor and mitigate and we are doing everything we can to resume full service. As soon as we have any further update, we will be in touch. 

    Thank you for your ongoing patience, and please accept our apologies for the inconvenience this will be causing.

    » Updated Thu, 04 Jul 2019 12:06:00 +0000
  • Update

    Our hosting partner is continuing work to resolve this issue and we are doing everything we can to resume full service. As soon as we have any further update, we will be in touch.

    Thank you for your ongoing patience, and please accept our apologies for the inconvenience this will be causing.

    » Updated Thu, 04 Jul 2019 11:00:00 +0000
  • Update

    Our hosting partner is continuing to monitor and mitigate and we are doing everything we can to resume full service. As soon as we have any further update, we will be in touch.

    Thank you for your ongoing patience, and please accept our apologies for the inconvenience this will be causing.

    » Updated Thu, 04 Jul 2019 10:08:00 +0000
  • Update

    We have determined that the ongoing issue is not related to our scheduled maintenance but is due to an issue our hosting partner is experiencing with their services.

    They are working to resolve this issue and are providing updates every 2 hours to outline their progress. As soon as we have any further information, we will be in touch to advise.

    Thank you for your ongoing patience, and please accept our apologies for the inconvenience this will be causing.

    » Updated Thu, 04 Jul 2019 08:59:00 +0000
  • Update

    We are aware that this issue is also affecting Kallidus Perform. This is our highest priority and we have all available resource working to resolve this issue.

    » Updated Thu, 04 Jul 2019 08:04:00 +0000
  • Investigating

    The scheduled maintenance is overrunning due to unforeseen circumstances. This is affecting all Classic LMS and Learn customers. 

    Engineers are working to restore service as soon as possible for all affected customers. The next update will be provided in 60 minutes, or as events warrant on our status page. 

    Please accept our apologies for any inconvenience this may cause. 

    » Updated Thu, 04 Jul 2019 07:22:00 +0000