Many Campus Websites Are Down
Incident Report for University of Guelph IT System
Resolved
Following Tuesday's maintenance, the web infrastructure has remained stable, and we now consider this issue resolved. Thank you for your continued patience.
Posted Apr 21, 2017 - 12:22 EDT
Update
We saw a brief but related outage on Sunday afternoon, from approx. 2:45-3:10 PM. We believe the problem is related to a piece of infrastructure that provides a shared storage environment for many CCS-hosted websites. Websites that are using the workaround from earlier in the week were not affected yesterday, but websites using the shared storage environment were.

Our vendor recommends a configuration change to the shared storage infrastructure to help prevent future outages. We will make this change during tomorrow's regularly scheduled CCS maintenance window, from 6:00 to 8:00 AM. During this window, some websites will be briefly offline again while we make the change. We apologize for the very short notice.
Posted Apr 17, 2017 - 13:31 EDT
Monitoring
Yesterday's workarounds remain in place, and there were no new website outages overnight. We continue to investigate the root cause. This may be a longer-term effort that results in a future scheduled maintenance window. For now, we continue to monitor the situation.
Posted Apr 12, 2017 - 10:00 EDT
Update
We are pleased to share that we have implemented workarounds for all of the websites that were still offline as of about 3:30 PM. These websites have been online and operational since.

Unfortunately, we still don't understand the root cause of this problem and continue to investigate. Because we haven't yet identified or resolved the root cause, there is some risk that some websites may go offline overnight. We will be monitoring closely and will engage our on-call procedures if necessary.

We will update the community again by 10:00 AM Wednesday at the latest.
Posted Apr 11, 2017 - 16:35 EDT
Update
We continue to investigate these outages, but we do not have a solution to report. At this time we do not expect a solution by end of business day. If we have anything further to report, we will do so by 5:00 PM today. Otherwise, we will provide an update by 10:00 AM tomorrow.
Posted Apr 11, 2017 - 14:59 EDT
Investigating
While the majority of these CCS-hosted websites remain online, we have not yet identified the root cause of the problem, and there are still several websites that we cannot yet bring back online. We are working on a number of options behind the scenes, and initiating a restore from backup in the background in case we determine we need it.

We will provide another update by 3:00 PM at the latest.
Posted Apr 11, 2017 - 13:07 EDT
Monitoring
The majority of websites are now back up and running. CCS will continue to work on restoring the remaining few websites that are down and will continue to monitor the situation to ensure all sites remain up.
Posted Apr 11, 2017 - 09:12 EDT
Investigating
CCS is investigating an issue with a large number of campus websites. Visitors attempting to access these websites see a "502 - Bad Gateway" error message. The issues started appearing around 4:30 AM this morning.
Posted Apr 11, 2017 - 07:44 EDT