At approximately 2:00 AM PST on Saturday, February 15th, a primary storage server became unresponsive, and SurePrep engineers were unable to interact with the machine despite their best efforts. The situation was escalated to IBM data center support, and ultimately the machine was restarted. The restart resulted in the loss of a key Microsoft ReFS volume containing SPbinder workpaper images, a result that Microsoft Support and Development are still analyzing.
As soon as SurePrep engineers determined the scope of the event, they began provisioning new storage. Provisioning was completed at approximately 8:30 AM PST, allowing the submission of new binders to continue. Once the new storage was in place, SurePrep engineers began transferring binder files from existing warm storage to the new live storage.
This transfer took longer than anticipated, especially for without-leads binders, which have more complex internal data and file structures. Once a binder’s files completed the transfer, they became viewable again in SPbinder.
In response, we have begun configuring a hot-failover storage strategy for the primary file systems to reduce future recovery time to minutes. In addition, we are partnering with Microsoft to understand and resolve the core server issue and to evaluate different infrastructure strategies. All existing backup systems are up and running, and all data and files continue to be protected.