IT administrators today have quite a bit to manage every day, and what’s worse, the influx of things to do seems to have no end. In this blog post I’d like to address a few of the things that you can do to eliminate repetitive, non-value-added tasks from your routine.
Proactive monitoring of your messaging and collaboration systems can save a great deal of the time spent reviewing logs and performance counters. Employing a solution that automates these activities and alerts you when an issue arises will free you up for more value-added tasks.
There are only three options when it comes to monitoring your messaging and collaboration environment.
- Manual-This is not a very realistic option, especially when many users can be impacted. It is possible, though, to perform certain checks at certain points in the day to validate service. As mentioned above, you can also keep a close eye on logs and performance counters. In most scenarios, however, the majority of your time would be spent just making sure there are no issues.
- Scripted-There are a good number of resources for scripted monitoring available on the web (at least for certain components). While a scripted option can satisfy some concerns, it can still leave areas uncovered. That said, someone who is very proficient with scripting (and has many hours free) will be able to put a fairly simple monitoring method into use.
- Third Party-The third party option usually has a cost, but with that cost comes a great deal of benefit. A third party monitoring tool will usually address each specific monitoring need in an automated and reliable way. These solutions tend to be highly customizable and can easily scale with your environment.
Having a good monitoring solution in place will allow you to focus on maintaining and improving your environment while ensuring that your services are being delivered to your customers properly.
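To give a feel for what the scripted option involves, here is a minimal sketch of a threshold check in Python. The counter names, values and thresholds are made up for illustration, and a real script would still need to collect the values from your servers and deliver the alert somewhere useful.

```python
from dataclasses import dataclass


@dataclass
class Check:
    name: str         # counter being watched (hypothetical names below)
    value: float      # latest sampled value
    threshold: float  # alert when the value exceeds this


def failing_checks(checks):
    """Return the checks whose latest value exceeds their threshold."""
    return [c for c in checks if c.value > c.threshold]


# Hypothetical samples; a real script would read these from the servers.
samples = [
    Check("mail_queue_length", 12, 100),
    Check("disk_used_percent", 93.5, 90),
    Check("cpu_percent", 41, 85),
]

alerts = failing_checks(samples)
for c in alerts:
    print(f"ALERT: {c.name} = {c.value} (threshold {c.threshold})")
```

Even this small example hints at the gaps mentioned above: it only knows about the counters you remembered to list, and someone still has to schedule it and maintain it.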
Reporting & Trending
How are your servers performing over time? How often are alerts generated? By creating and delivering custom reports you can easily answer these questions and have instant access to the data you need to correlate performance measures in your environment with service events. As with monitoring, there are three options here.
- Manual-Manually monitoring your environment can be time consuming. Manually collecting, reporting and trending data on it is even less realistic. If you’d like I can discuss that further, but for most (if not all) this is not an option.
- Scripted-As with monitoring, scripted data collection, reporting and trending is possible. However, the information gathered can be limited and the creation process can be very time consuming.
- Third Party-Most monitoring tools also collect and retain data for future analysis. This should be a requirement for any tool, as the information can be very valuable in analyzing your environment’s performance and availability. Again, third party tools tend to be reliable and automated, simplifying the process.
Reporting and trending can make your life easier by providing consolidated information that allows you to quickly pinpoint performance issues, typically before they impact your end users. It is often easier (and less time consuming) to remedy a potential issue than it is to resolve a full-blown service outage.
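As a rough illustration of the trending step, the sketch below rolls raw samples up into daily averages. The dates and CPU figures are invented for the example, and it assumes the samples have already been collected as (day, value) pairs.

```python
from collections import defaultdict
from statistics import mean


def daily_averages(samples):
    """Group (day, value) samples by day and average each group."""
    by_day = defaultdict(list)
    for day, value in samples:
        by_day[day].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Hypothetical CPU samples (percent) taken twice a day.
cpu_samples = [
    ("2024-01-01", 40.0), ("2024-01-01", 60.0),
    ("2024-01-02", 70.0), ("2024-01-02", 90.0),
]

trend = daily_averages(cpu_samples)
for day, avg in trend.items():
    print(day, avg)
```

A day-over-day jump like the one in this made-up data is the kind of thing a trend report surfaces before it turns into an outage.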
Once you have good metrics on performance, capacity and availability, you can use that information to forecast how your environment will perform in the future. This information can be critical when it comes time to purchase new equipment, as you will have a clear picture of growth over time. Realistically, there are probably only two options for forecasting the growth of your environment.
- Scripted-If you’ve scripted monitoring and the collection of data, reporting and trending, I suppose you’re a master script wizard (see Tyler). If this is the case, I’m fairly certain that you can create a mathematical formula that will allow you to see into the future, thus giving you the ability to forecast your environment’s needs. You will have no problem determining the amount of disk you need, the number of servers, your bandwidth usage, etc. You will have spent an eternity getting to this point, but you have arrived.
For everyone else who is not a master script wizard (with unlimited free time), please see below.
- Third Party-Not all third party tools have the ability to forecast growth built in. It’s an important feature because it gives you a true picture of where you’ve been and, based on that, where you’re going. This can be summed up in a simple report, delivered at your convenience, that you can reference when it comes time to place your orders for additional capacity.
If you’ve maintained historical data on the performance and usage of your environment, it can greatly assist in planning for the future. Forecasting takes the guesswork out of the equation: having real statistics on your service usage gives you a better idea of what your needs will be in the future, based on your growth rate in the past. This will allow you to spend less time guessing and more time improving.
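For the script wizards, one simple form the "mathematical formula" could take is a straight-line fit through historical usage. The sketch below assumes roughly linear growth and uses invented disk figures; real usage is rarely this tidy, and the function names are my own.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def months_until_full(months, used, capacity):
    """Months after the last sample until usage reaches capacity.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = fit_line(months, used)
    if slope <= 0:
        return None
    return (capacity - intercept) / slope - months[-1]


# Hypothetical history: disk used (GB) at the end of each month.
months = [0, 1, 2, 3, 4, 5]
used_gb = [500, 520, 540, 560, 580, 600]

remaining = months_until_full(months, used_gb, capacity=1000)
print(f"Roughly {remaining:.0f} months of disk left at the current growth rate")
```

The same idea applies to mailbox counts or bandwidth; only the input series changes.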
Having metrics on performance and availability allows you to make an easy comparison with your Service Level Agreement and identify which performance metrics are impacting your service. If you correlate this with each server’s performance, you can clearly see which servers are impacting service delivery. OK, there is one realistic option here. I know, “but it can be scripted”; let’s leave that for another post.
- Third Party-As the only real option for SLA management, some third party tools allow you to take everything you’ve learned from above (monitoring, reporting and trending, forecasting) and apply it to what matters most: how did we do against our customers’ expectations? If you truly want to get the most out of your monitoring and reporting solution, you need the ability to compare your performance against the SLA with metrics from the individual components supporting that SLA. In this fashion, you can see which components are negatively impacting service delivery. For example, let’s say you have an SLA of 5 minutes round trip mail delivery
SLAs are typically not a bright point in conversation. So how can we use SLAs to our advantage? With the proper tools, we can show without a doubt that our service meets the expectations of our customers. What’s more, if we have the ability to see which components are negatively impacting our SLA, we can resolve problems before the customer even notices. The more you know about your environment’s performance, the less you need to be discussing it, again freeing you for tasks that add value.
The role of a messaging administrator has become increasingly complicated: new technologies, new requirements, new standards and new processes. It seems we’ve become very good at over-complicating what used to be (though it didn’t seem so at the time) fairly simple. One way to simplify our current responsibilities is to cut out everything that doesn’t help us, our environment and our organization grow. Automating monitoring, reporting and SLA management can free up what little valuable time we have and, hopefully in the process, keep us out of the boss’s office.
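To make the per-component comparison concrete, here is a small sketch that measures round trip mail delivery against a 5 minute SLA and breaks the result down by server. The server names and timings are hypothetical, and the choice to count a sample as compliant when it finishes within the limit is my own assumption.

```python
SLA_SECONDS = 5 * 60  # 5 minute round trip target


def sla_compliance(samples, limit=SLA_SECONDS):
    """Percent of round trip samples within the limit, per server."""
    totals, met = {}, {}
    for server, seconds in samples:
        totals[server] = totals.get(server, 0) + 1
        if seconds <= limit:
            met[server] = met.get(server, 0) + 1
    return {s: 100.0 * met.get(s, 0) / totals[s] for s in totals}


# Hypothetical (server, round trip seconds) measurements.
round_trips = [
    ("mail01", 45), ("mail01", 120), ("mail01", 90),
    ("mail02", 280), ("mail02", 410), ("mail02", 365),
]

report = sla_compliance(round_trips)
worst = min(report, key=report.get)  # server with the lowest compliance
for server, pct in report.items():
    print(f"{server}: {pct:.1f}% within SLA")
print("Look at", worst, "first")
```

In this made-up data the overall numbers would hide the fact that one server is responsible for every miss, which is exactly the detail that lets you fix the problem before the customer notices.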