Created
May 29, 2011 11:33
Hello IOFLOOD customers,
I wanted to let you all know about some networking issues that we've been having today, and to a lesser extent yesterday, and to tell you what we've done to resolve them.
First of all, yesterday we had to suspend a customer for abuse. The server was being used as an open proxy, to distribute malware, and as a phishing site. After we shut the server down, its clients continued to send traffic to it. Since the server was offline, nothing was there to reply to this traffic, and our switch started sending "destination unreachable" messages to the clients trying to connect to the server.
Normally, this would not be a problem. However, these unreachable messages were being generated by the switch CPU, which is not terribly fast. Because so many of these messages were being sent, the network was not very stable for a couple of hours yesterday, although actual downtime was only a few minutes. We had our upstream provider null-route the offending IPs, which restored network stability at that time.
While looking into the issue, we found that a newer version of our switch firmware would allow us to disable these "destination unreachable" messages so that this problem would not happen again. We had planned to do this upgrade sometime next week, after making sure we had everything in place for all possible contingencies and to minimize downtime.
Unfortunately, another server on our network started having issues today. As a result, our switch again had to send packets using its CPU, which overloaded the switch and made the network unstable. While troubleshooting and trying to resolve this issue, we lost remote connectivity to the network. We then dispatched our technician to the datacenter, while also working with our upstream provider and the datacenter staff to resolve the issue. The network was down for approximately 1 hour during this time.
After restoring network connectivity, we made preparations to upgrade the switch firmware to a version that does not suffer from these serious issues. During the upgrade process, the network was again inaccessible for approximately 15 minutes. The firmware upgrade went as planned, and the network came back up once it completed. We've now made the appropriate configuration change, which should prevent this issue from recurring.
Attached is a copy of the relevant switch CPU graph during the network instability, during the downtime, and after the switch upgrade.
On behalf of everyone here at IOFLOOD, I would like to sincerely apologize for this network downtime. Please be assured that we had no reason to expect this to happen, and that we have done what we can to mitigate the effects and restore stable connectivity as quickly as possible.
If you continue to have any network problems, or have any other questions or concerns, please let me know and I'd be glad to help however I can.
Sincerely,
Gabe