For all three of my regular readers, sorry for the slow updates. I’ve decided to make the site a little less newsy and more about what I’m learning on a day-to-day basis.
And I’ve been learning quite a bit, because about three weeks ago, our internet usage spiked incredibly – saturating our T1s out to the internet and slowing traffic down to a crawl.
The chart below shows the traffic flowing into our district from our ISP. Note the sharp increase on Weds in week 15 – that’s when the trouble started.
I first became aware of the problem when our AP and payroll clerks came to ask why their terminal screens were taking 3-5 seconds to display every letter they typed. I checked the normal things: CPU utilization on my router and ATM switch (it would be high in a virus outbreak), my Fluke One Touch for broadcast storms or excessive errors, and my mail server for signs of spamming. Nothing seemed out of the ordinary – low CPU utilization, no network problems, no spam.
I called the Lead Tech at my ISP to see if he saw anything out of the ordinary. He checked for P2P traffic, spy/adware traffic and streaming media and found just standard web traffic – his traffic shaping box didn’t flag anything unusual. I decided to give it a day or two and see if it settled down on its own – while figuring out what my options were if it didn’t.
Monday found the traffic still heavy and me scrambling for solutions. I decided to set up Squid on the SuSE 9 Enterprise Server I had recently set up for testing the possibility of migrating my web server to Linux. The install went fairly smoothly with YaST, and I had Squid up and running in a matter of minutes. The only problem is that it comes configured to disallow all connections by default. I just needed to edit the config files and allow the machines in my network to talk to it.
At this point, I decided to install Webmin, as it simplifies this type of task tremendously. Unfortunately, it isn’t included in the base SuSE packages, but I was able to download an RPM from the Webmin site and get it up and running in no time. After a bit of ACL tweaking, I was able to point my workstation to the proxy and hit the web.
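For anyone curious what that ACL tweaking amounts to in squid.conf itself, here is a minimal sketch. The ACL name "localnet" and the 10.0.0.0/8 range are placeholders chosen for illustration, not the addressing actually used in this network; substitute your own internal subnet. Squid evaluates http_access rules top to bottom, so the allow line has to come before the final deny.

```
# Hypothetical ACL name and subnet: replace with your own internal range
acl localnet src 10.0.0.0/8

# Let clients in that range use the proxy...
http_access allow localnet

# ...and keep refusing everyone else
http_access deny all

# Squid's default listening port; clients point their browser proxy setting here
http_port 3128
```

Webmin's Squid module edits these same directives, so either route should end up with an equivalent file.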
I used Windows Group Policies to set a couple of grades to use the proxy Monday afternoon, and let them run Tuesday to make sure the server didn’t choke on a fair-sized load. Tuesday also kept me busy setting up the Fluke OptiView on loan from my ISP. Initial tests with the OptiView showed nothing out of the ordinary. I enabled the proxy in all Windows Group Policies later Tuesday afternoon.
I was rewarded with a greatly reduced traffic load on Wednesday (week 16). Telnet traffic was smooth again, and the web lost the slow-as-molasses feeling it had developed. I wasn’t quite back to pre-spike levels, but I had seen a lot of streaming video from ESPN, and hoped that blocking it would bring me where I wanted to be. I patted myself on the back and decided all I needed to do was decide whether to use the Linux/Squid combo going forward or pick up a commercial appliance that would do the same.
How wrong I was... the gory details are forthcoming in my next installment, in which we see bandwidth continue to rise and the real culprit unmasked.
T.J. — Nothing to do with this post, but I’m a dev on GM, and saw your post on /. about GM crashing FF on OS X.
Please email me with some details. What version of FF, what version of GM, what scripts were installed, etc.
0.2.5 was pretty buggy, but other than that, I haven’t heard any such complaints before.
jdunck at gmail daht com