IT applications are built around the end-user experience, and as more GSX customers transition from on-premises to hybrid service delivery, we’ve seen many latency issues that negatively affect end users. When the root cause is not immediately known, ensuring that your infrastructure delivers the best end-user experience depends heavily on the network.
Therefore, it is necessary to take a network-centric approach to monitoring the end-user experience: that is, getting visibility of the end-user experience through the network and all the way to the application. As a result, it becomes easier to troubleshoot and fix any network and configuration issues causing latency. In this post, we’ll explain how to quickly troubleshoot your network and how it impacts the end-user experience.
Users are focused on what works and how well it works. This means that if the user connection is slow, there will be complaints. Therefore, the key is to detect network performance issues before they reach the end user. This is done through “baselining,” a method for analyzing network performance by comparing current performance to a historical metric, or “baseline.” You establish the baseline of your “normal” network performance by constantly measuring it from your critical locations to the endpoints that matter, whether they are applications on premises or in the cloud. This enables you to detect (usually before the user) that a latency issue is occurring.
After you have baselined your environment, you should check the network configuration and latency of your main applications in real time, making it easier to troubleshoot if there is an issue. You should also be alerted as soon as your users’ network performance falls below normal. Once you receive the alert, you’re one step ahead of any user ticket because your users will still need time to recognize performance issues themselves. This allows you to stay ahead of network issues and troubleshoot them before users report them.
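To make the baselining idea concrete, here is a minimal sketch in Python. It assumes the baseline is simply the mean and standard deviation of past latency samples and that "below normal" means latency more than a few standard deviations above that mean. The function names, the sample values, and the `tolerance` parameter are illustrative assumptions, not how any particular product computes its baseline.

```python
from statistics import mean, stdev


def build_baseline(samples_ms):
    """Summarise historical latency samples (ms) as a mean and standard deviation."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return {"mean": mean(samples_ms), "stdev": stdev(samples_ms)}


def latency_is_abnormal(current_ms, baseline, tolerance=3.0):
    """Return True when the current latency exceeds mean + tolerance * stdev.

    The tolerance of 3 standard deviations is an assumed default.
    """
    threshold = baseline["mean"] + tolerance * baseline["stdev"]
    return current_ms > threshold


# Hypothetical latency history (ms) from one location to one application.
history = [42.0, 40.5, 43.1, 41.2, 39.8, 44.0, 42.7]
baseline = build_baseline(history)

print(latency_is_abnormal(43.5, baseline))  # close to the usual values
print(latency_is_abnormal(95.0, baseline))  # far above the usual values
```

In practice you would probably keep a separate baseline per location and per application, and refresh it over a rolling window so that it follows gradual, legitimate changes.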
GSX’s Traceroute tool is one of the most effective means of identifying the exact location of latency before it affects the end-user experience. To show exactly where the latency occurs, Traceroute:
- Provides you with the detailed route from the user to the application, with the latency at each network hop.
- Alerts you, with the right tool, to any change in the number of hops between the user and the application.
The route between a user and an application can be very unstable. There are many components, some of them beyond your control, that can change that route. However, it is important to know if there is a significant increase in the number of network hops, because that can indicate network configuration errors or problems with the internet provider. Here more than ever, it’s important to monitor these parameters constantly: if you only run the command when you receive a customer ticket, you will almost certainly miss information that is essential for troubleshooting the network.
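The two traceroute checks described above can be sketched as follows. This is an illustrative Python example that works on hop data you have already collected. The `(address, rtt_ms)` tuple format, the function names, and the `max_extra_hops` threshold are assumptions made for the sketch, and the addresses come from reserved documentation ranges.

```python
def hop_count_increased(baseline_hops, current_hops, max_extra_hops=2):
    """Return True when the current route has significantly more hops than the baseline."""
    return len(current_hops) - len(baseline_hops) > max_extra_hops


def slowest_segment(hops):
    """Return (address, added_ms) for the hop that adds the most latency.

    hops is a list of (address, rtt_ms) tuples in path order, where rtt_ms
    is the round-trip time measured to that hop.
    """
    worst = None
    previous_rtt = 0.0
    for address, rtt in hops:
        added = rtt - previous_rtt
        if worst is None or added > worst[1]:
            worst = (address, added)
        previous_rtt = rtt
    return worst


# Hypothetical routes from a user location to an application.
baseline_route = [("192.0.2.1", 1.2), ("198.51.100.7", 9.8), ("203.0.113.45", 14.0)]
current_route = [
    ("192.0.2.1", 1.2),
    ("198.51.100.7", 9.8),
    ("203.0.113.20", 84.5),
    ("203.0.113.45", 88.0),
]

print(hop_count_increased(baseline_route, current_route))
print(slowest_segment(current_route))
```

Per-hop round-trip times can be noisy, since intermediate routers often give low priority to answering probes, so a real check would likely average several runs before raising an alert.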
Comparing the network latency to an application with the latency to a few endpoints on the Internet can help you understand where in the network the problem lies. You can also ping the ingress point of your users (the last local network point before the Internet), a technique also known as a “witness ping,” to get a clear view of the impact of the local network on an application’s latency.
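One way to reason about this comparison is sketched below in Python. It assumes you already have a current and a "normal" latency for the witness ping, for a reference Internet endpoint, and for the application, and it treats a measurement as degraded when it exceeds its normal value by an assumed factor. The dictionary keys, the `factor` value, and the returned labels are all hypothetical.

```python
def locate_latency(measured_ms, normal_ms, factor=2.0):
    """Guess which part of the path is responsible for a latency increase.

    Both arguments map 'witness', 'reference' and 'application' to a
    latency in milliseconds. The checks run from the closest point outward.
    """
    def degraded(key):
        return measured_ms[key] > factor * normal_ms[key]

    if degraded("witness"):
        return "local network"
    if degraded("reference"):
        return "internet path"
    if degraded("application"):
        return "application or its hosting network"
    return "normal"


normal = {"witness": 2.0, "reference": 20.0, "application": 35.0}

# Only the application is slow: the local network and Internet path look fine.
print(locate_latency({"witness": 2.4, "reference": 22.0, "application": 140.0}, normal))

# The witness ping itself is slow: the problem starts on the local network.
print(locate_latency({"witness": 15.0, "reference": 40.0, "application": 160.0}, normal))
```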
It is important to know how the network impacts your end users. For that, there are a few performance counters and aggregates that you have to constantly measure:
- Packet Loss: measures how many packets are lost during transmission. The packet loss rate is measured as a percentage.
- MOS (Mean Opinion Score): measures the network's impact on the listening quality of a VoIP (Voice over Internet Protocol) conversation. The network MOS rating ranges from 1 to 5, with 1 being the poorest quality and 5 being the highest quality.
- Jitter: measures the variation in arrival times of packets being received. Interarrival jitter is measured in milliseconds (ms).
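As an illustration of how two of these counters can be derived from raw measurements, here is a short Python sketch. Packet loss is a simple percentage, and jitter uses the running interarrival estimate defined in RFC 3550 (each new transit-time difference moves the estimate by one sixteenth). The function names and sample timestamps are assumptions. MOS is left out because it requires a voice-quality model on top of these inputs.

```python
def packet_loss_percent(sent, received):
    """Return the share of packets lost in transit, as a percentage."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def interarrival_jitter_ms(send_times_ms, recv_times_ms):
    """Return the RFC 3550 style interarrival jitter estimate in milliseconds.

    Both lists hold timestamps (ms) for the same packets, in order.
    """
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Difference in transit time between consecutive packets.
        delta = (recv_times_ms[i] - recv_times_ms[i - 1]) - (
            send_times_ms[i] - send_times_ms[i - 1]
        )
        # Smooth the estimate with a gain of 1/16.
        jitter += (abs(delta) - jitter) / 16.0
    return jitter


print(packet_loss_percent(sent=200, received=197))

# Packets sent every 20 ms; the second one arrives 5 ms late.
send_times = [0.0, 20.0, 40.0, 60.0]
recv_times = [50.0, 75.0, 90.0, 110.0]
print(interarrival_jitter_ms(send_times, recv_times))
```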
All these performance counters are crucial to understanding how the network really impacts the end-user experience. The key to troubleshooting network issues is to constantly measure network latency from where your users are, so that you can detect, troubleshoot and fix issues before you get overwhelmed by user complaints.
We have developed our GSX Robot User to help measure potential network issues.
The GSX Robot User is a Windows service that can run on any Windows machine, and automatically does everything we’ve outlined in this blog:
- It constantly measures the network latency between where your users are and the applications you want to monitor, on premises or in the cloud.
- It provides you with a baseline of the network experience, allowing you to be alerted as soon as performance is out of its normal range.
- It constantly checks the route to the applications, alerting you when the number of hops is anomalous.
- It constantly runs Traceroute to provide you with a clear analysis of where the latency was at the moment it started.
- It constantly gathers the key performance counters you need to determine if the network is impacting your application’s performance.
- It tests the main root causes of failure, such as DNS availability and resolution time and port connectivity to the application.
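The last two checks in the list above, DNS resolution time and port connectivity, are easy to approximate with Python's standard library. The sketch below is a generic illustration and not the Robot User's implementation. The function names and the 3-second timeout are assumptions.

```python
import socket
import time


def dns_resolution_time_ms(hostname):
    """Return how long the name lookup took in ms, or None if it failed."""
    start = time.perf_counter()
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return None
    return (time.perf_counter() - start) * 1000.0


def port_is_reachable(host, port, timeout=3.0):
    """Return True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling `dns_resolution_time_ms("mail.example.com")` and `port_is_reachable("mail.example.com", 443)` on a schedule (the host name is a placeholder) gives you a simple availability signal that can be compared against a baseline. Note that the operating system may cache lookups, so repeated DNS timings can look faster than a cold lookup.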
You can deploy, manage and see all your Robot Users on a single dashboard that provides you with a clear picture of the network performance that is provided to your end users.
At the core of GSX's mission is end-user experience monitoring. GSX provides out-of-the-box monitoring to ensure all business-critical applications are performing the way they should at all times, ensuring smooth and uninterrupted service delivery.