Feedback Request: Scheduled Maintenance Plan

Below is a draft plan for implementing a regularly scheduled maintenance period each week. I would like feedback on it, including how to make it as appealing as possible to the campus community and how well it fits within the ITS world.

Thanks!

Curtis


Overview:

The mission of Information Technology Services is to advance the UIS vision, mission and strategic goals. A core component of achieving this mission is to deliver robust, reliable and secure information services. Inherent in this component is the need to regularly maintain the systems and networks that these services are built upon. This plan addresses the need to provide a regular time period (a “maintenance window”) for performing routine system maintenance.

Goals:

This plan is intended to balance the desire for uninterrupted access to UIS information services against the competing needs to apply critical patches, test redundancy designs, upgrade system components, and perform other routine maintenance. The goals are to keep the amount of both planned and unplanned service disruption to a minimum while at the same time providing as much advance notice as possible to the UIS community about potential periods of service disruption. Both of these goals are best achieved by establishing a regular repeating schedule for planned maintenance.

Regular Schedule:

Weekly: The maintenance window will be from 5:00 AM to 7:00 AM each Thursday morning.

Schedule Exceptions:

During the last few weeks of the fall and spring terms, we will do whatever we can to avoid scheduling any work that would disrupt access to critical campus learning systems, so that students, faculty and staff can have as much uninterrupted access as possible.

Additionally, there may be times throughout the year when a maintenance procedure cannot fit inside the regular maintenance window. For those rare occasions, an additional scheduled maintenance period will be announced at least one week ahead of time.

What to Expect:

At least 24 hours prior to a maintenance period, ITS will announce which services, if any, will be affected by the maintenance window, and to what extent those services will be affected (e.g. performance may be degraded, or access may be intermittent or completely unavailable). When the maintenance window arrives, every effort will be made to minimize the time needed to perform the required maintenance. When work has been completed, ITS will announce that as well.
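
As a rough illustration of the timing described above, the following Python sketch computes the next Thursday 5:00 AM to 7:00 AM window and the latest time an announcement could go out while still giving 24 hours of notice. The function and constant names are hypothetical, and the sketch assumes naive local campus time with no handling of daylight saving changes or schedule exceptions.

```python
from datetime import datetime, timedelta

# Values taken from the draft plan; the names themselves are illustrative.
WINDOW_WEEKDAY = 3                    # Thursday (Monday is 0)
WINDOW_START_HOUR = 5                 # 5:00 AM local time
WINDOW_LENGTH = timedelta(hours=2)    # window ends at 7:00 AM
NOTICE_LEAD = timedelta(hours=24)     # minimum advance notice


def next_window(now: datetime) -> tuple[datetime, datetime, datetime]:
    """Return (announce_by, start, end) for the next window starting after `now`."""
    days_ahead = (WINDOW_WEEKDAY - now.weekday()) % 7
    start = (now + timedelta(days=days_ahead)).replace(
        hour=WINDOW_START_HOUR, minute=0, second=0, microsecond=0
    )
    # If this week's window has already started, move to next week's.
    if start <= now:
        start += timedelta(days=7)
    return start - NOTICE_LEAD, start, start + WINDOW_LENGTH


if __name__ == "__main__":
    announce_by, start, end = next_window(datetime.now())
    print("Announce by:", announce_by)
    print("Window:", start, "to", end)
```

With these assumptions, the announcement deadline always falls on Wednesday at 5:00 AM, which may be worth stating explicitly in campus communications.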

Frequently Asked Questions (FAQ):

Q. We’ve not had regular maintenance windows in the past. Why now?

A. ITS understands the desire for uninterrupted access to network services 100% of the time, and we always strive to accommodate that desire as much as reasonably possible. At the same time, we also know that the information systems we maintain are both very complex and fallible. Vendors are constantly releasing new critical patches to help avoid system disruptions or security breaches. In the past, we’ve tried to limit applying those patches to a few times a year.

But this causes two problems. First, it leaves our systems exposed to known vulnerabilities for a much longer period of time, putting at risk the security of information stored there. Second, it requires longer maintenance periods where sometimes hundreds of patches are being applied at the same time. Managing this process is difficult; should one of the patches cause an unforeseen problem, it is difficult to know which patch is the culprit, and troubleshooting these problems increases the amount of disruption.

In addition, historically we’ve not been able to carve out the additional time needed to test the configurations that are designed to keep systems functioning even when an individual component fails. We often configure critical information systems with redundant designs, but have not believed it was feasible to take the next required step: testing those redundant designs. Testing is absolutely vital to ensuring that our failure plans are sound and can be counted on when parts do fail (which they inevitably will). Because there is an inherent risk of service disruption during failure testing of production systems, we have been reluctant to perform testing in the past. We now feel that it is simply unwise to avoid performing such testing.

For these reasons, we believe it is better to have more frequent but shorter time periods to perform updates and testing. In the end, this should result in fewer and less severe unplanned disruptions of campus information services.

Q. What do other schools like ours do?

A. Having regular periods for scheduled maintenance is a normal activity for other higher education institutions like ours. UIC has a weekly scheduled maintenance period of 2 hours. AITS has a weekly scheduled maintenance period of 6 hours. A review of other universities’ policies also indicates that providing regular periods for maintenance is a common practice.

Q. Why Thursday mornings? Wouldn’t weekends be better?

A. Sometimes an upgrade or other work will not complete successfully on the first attempt. In those cases, it may be necessary to obtain the help of a vendor’s technical support team to quickly correct the problem. Our experience working with many computer system vendors has shown that it is much easier to get quality technical support during the middle of the week than on weekends. Scheduling the window later in the work week also allows us to be better prepared, helping to keep disruptions to a minimum.

Q. Does this mean that there will always be disruption to UIS information system services for two hours each week? Isn’t that excessive?

A. No, that is not what we expect. The services that are impacted each week will vary, as will the level of disruption. Some weeks, perhaps most weeks, there will not be any noticeable disruption, either because no work is scheduled or because the impact is minimal or limited to specialized, non-critical systems. Our goal is still to provide 99.99% uptime for core campus services throughout the year, even including these maintenance windows.
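
To put the 99.99% goal in perspective, here is a small arithmetic sketch in Python. It assumes a 365-day year and 52 two-hour windows, and it treats any outage of a core service during a window as counting against the uptime figure, which is how the goal is worded above. The function names are illustrative only.

```python
HOURS_PER_YEAR = 365 * 24        # 8760 hours, ignoring leap years
WINDOW_HOURS_PER_WEEK = 2        # Thursday 5:00 AM to 7:00 AM
WEEKS_PER_YEAR = 52


def downtime_budget_minutes(uptime_target: float) -> float:
    """Minutes of downtime per year allowed by a given uptime target."""
    return HOURS_PER_YEAR * 60 * (1 - uptime_target)


def uptime_given_window_usage(fraction_down: float) -> float:
    """Yearly uptime if core services are down for this fraction of every window."""
    down_hours = WEEKS_PER_YEAR * WINDOW_HOURS_PER_WEEK * fraction_down
    return 1 - down_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    budget = downtime_budget_minutes(0.9999)
    print(f"99.99% uptime allows about {budget:.2f} minutes of downtime per year")
    print(f"Uptime if every window were a full outage: {uptime_given_window_usage(1.0):.2%}")
```

Under these assumptions, 99.99% uptime allows roughly 52.6 minutes of downtime per year, or about one minute per weekly window on average. If core services were fully unavailable for all 104 window hours, uptime would be about 98.81%. The goal is therefore reachable only if core services are rarely taken down during the window, which matches the expectation that most weeks will bring no noticeable disruption.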
