Status update at 8:00 pm

As of 6:00 pm this evening, Oracle engineers have successfully reconstructed all of the failed components of the disk storage array that has been out of service since early Tuesday morning. They are now reconnecting these components so that all of them are available to the storage array’s controllers. This process will continue for the next few hours, after which NJIT staff will begin re-starting NJIT applications and making them available to the campus community. Because more than 100 applications need to be re-started, it will take 8-12 hours before all of them are running again, and some may need up to an additional 12 hours to reach normal operating capacity.

Priority is being given to Highlander Pipeline, ADM e-mail, and AFS student file systems, which should begin operations later this evening. All ADM e-mail received since Tuesday morning has been held and will be gradually delivered to ADM mailboxes over the next 24 hours.

Status updates will continue to be posted on NJIT SOS as major applications are re-started and reach normal operating capacity.

Thank you for your patience.

Published by njitsos

This blog will provide the NJIT community updates about system outages.

13 thoughts on “Status update at 8:00 pm”

    1. @Fez: We have more than one disk array. This hardware failure happened to occur in a large array that has many important systems distributed inside it.

  1. Can you please explain on the blog how an outage such as this can occur. I thought you had enterprise-class technology in the backrooms of NJIT.

    1. @Mark: NJIT does have enterprise-class hardware, with redundancies. But no hardware is infallible, and sometimes multiple hardware failures happen, even to redundant, constantly monitored parts. The hardware failure isn’t what took the most time; because of the importance of the data involved, we were extremely careful to run the proper integrity checks on the data and drives, which, on 220+ hard disks and 54 terabytes of data, is not a quick or simple process.

      The alternative — restoring the data from off-site backups — would’ve taken an order of magnitude longer.
