Value of site monitoring

As previously established, I handle a couple of servers and a small network in a Sacramento datacenter.  I had set up all kinds of internal monitoring on bandwidth, server health, etc.  But one thing I had never done before was set up external monitoring.  Site monitoring can be invaluable in helping you manage your site and the services you use.

The reality is that things do go down from time to time.  Most of the time it is for maintenance purposes (or at least, it should be).  Sometimes, however, things do go wrong.  Quality site monitoring gives you instant notification (useful if it is something you can handle yourself), a history of outages, and can even help you hold a provider to their Service Level Agreement.

I recently signed up for monitoring with Pingdom, which offers attractive pricing for multi-site monitoring (since one location is never enough), and also runs checks at 1-minute intervals.  These are two important points.

First, most other services offer checks every 5 minutes as standard, or let you pay extra for 2 minutes.  1-minute resolution gives you much faster notification, and can pick up the smallest outages.  With 5 minutes, you could almost reboot a server in that time.  That is a definite outage, but a 5-minute check could miss it completely.
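To put rough numbers on that, here is a small back-of-the-envelope sketch.  It assumes checks fire at a fixed interval, take no time to run, and that the outage starts at a random moment relative to the check schedule.  The function name and the 90-second "reboot" figure are mine, picked purely for illustration; none of this comes from Pingdom.

```python
def detection_probability(outage_seconds: float, interval_seconds: float) -> float:
    """Chance that at least one check lands inside an outage.

    Assumptions (mine, for illustration): checks fire at a fixed interval,
    are instantaneous, and the outage begins at a random point between checks.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if outage_seconds <= 0:
        return 0.0
    # An outage at least as long as the interval always contains a check.
    return min(1.0, outage_seconds / interval_seconds)


if __name__ == "__main__":
    reboot = 90  # hypothetical reboot time in seconds
    for interval in (60, 120, 300):
        chance = detection_probability(reboot, interval)
        print(f"{interval:>3}s checks: {chance:.0%} chance of catching a {reboot}s outage")
```

Under those assumptions, a short outage is always caught at 1-minute resolution and is more likely missed than caught at 5 minutes.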

The second important thing is multi-site monitoring.  This is a necessity.  Over the weekend, my server had its first actual network issue.  It was outside of my control, but basically, I get access from two providers: Qwest and Verizon Business.  Verizon was having some network issues in the Sacramento area that were causing traffic coming in through them to be dropped, while traffic from Qwest was fine.  This caused a sort of brownout: the server was accessible or not depending on which route the traffic came in on.  Site monitoring that only checked from a single location could have shown the site as operating just fine.  Pingdom, however, checks from about 7 locations worldwide, which gives much higher confidence in its results.
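The idea behind multi-site checks can be sketched in a few lines.  This is only my illustration of the logic, not how Pingdom actually works internally; the location names and the "up"/"partial"/"down" labels are made up.

```python
from typing import Mapping


def classify(results: Mapping[str, bool]) -> str:
    """Summarize per-location check results.

    `results` maps a probe location name to True (reachable) or
    False (unreachable). Location names here are hypothetical.
    """
    if not results:
        raise ValueError("need at least one probe location")
    reachable = sum(1 for ok in results.values() if ok)
    if reachable == len(results):
        return "up"
    if reachable == 0:
        return "down"
    # Some routes work and some do not: the brownout case.
    return "partial"


if __name__ == "__main__":
    # A single probe that happens to come in over the healthy route sees nothing wrong.
    print(classify({"probe-a": True}))
    # Several probes over different routes expose the partial outage.
    print(classify({"probe-a": True, "probe-b": False, "probe-c": True}))
```

With one location you only ever get "up" or "down", and which one you get depends on the route that probe happens to take.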

On top of that, you can purchase additional service checks for only $0.50/month, so I have all my main services monitored, along with several of my upstream providers' services.  If this happens again, I'll know right away whether the problem is within my network or upstream, and can narrow it down from there.
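The narrowing-down step is simple enough to sketch.  Again, this is just my own reasoning written out as code, with hypothetical check names; it assumes that a failing upstream check means the problem is upstream, even if my own services are failing too.

```python
from typing import Mapping


def locate_fault(own: Mapping[str, bool], upstream: Mapping[str, bool]) -> str:
    """Guess where a problem sits from two groups of check results.

    Each mapping goes from a check name (hypothetical) to True if the
    check passed. Assumption: any failing upstream check points upstream.
    """
    if not all(upstream.values()):
        return "upstream"
    if not all(own.values()):
        return "internal"
    return "ok"


if __name__ == "__main__":
    own_checks = {"web": False, "mail": False}
    upstream_checks = {"provider-1-site": True, "provider-2-site": False}
    print(locate_fault(own_checks, upstream_checks))
```

It is a crude rule, but it is the first question I want answered when an alert comes in.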

This may sound like a commercial, but Pingdom really is quite cool.  They even report response times on all checks, which gives you some basic historical performance data.

Monday, November 12, 2007

