GSX Solutions specializes in optimizing the performance of Office 365 services. We have worked with many enterprise customers to measure their end-user experience with our Robot Users. We want to share that expertise with you by walking through real business cases we have faced with our customers.
Context of the Company
- Multinational pharmaceutical company, headquartered in Germany.
- It has branches in more than 100 countries and multiple research centers spread across five continents.
- Migration to Exchange Online, SharePoint Online and OneDrive.
- Whether an office uses SharePoint Online or OneDrive depends on its activity and on how it uses the central document management system.
- Information accuracy and availability are highly critical for the company.
The primary need
The whole project involves most of the business lines, the Office 365 team and IT management directly. Continuity of service is key, especially during the migration, which was planned for summer 2016 to limit the risks.
Unfortunately, starting in mid-July, and despite the large number of employees on vacation, there was a considerable increase in tickets from end-users complaining that the SharePoint environment was unavailable or that its main functions (uploading and downloading documents) were slow.
The problem with most of the tickets was that it was very difficult to tell whether they corresponded to a real issue. Most of the “slow” episodes lasted only moments and were nearly impossible to troubleshoot. Nothing was in place to track end-user experience and relate it to network performance.
In moving to a cloud solution, management knew that performance could become an issue in some regions. However, they had no way to diagnose or prevent it. Tickets were raised for both the OneDrive and SharePoint Online environments.
We met at Microsoft Ignite 2016 and decided to run a Proof of Concept (POC) with our solution shortly afterwards. The objective was to find out whether they could reduce the number of tickets by properly monitoring the end-user experience delivered to each critical branch office.
Summary of the need:
- Understand the end-user experience pattern
- Detect any potential bottleneck on the way to the cloud
- Compare performance against their on-premises SharePoint solution to determine whether there is a gap with the Online solution
- Understand the performance tickets that were opened
How GSX was put in place
We immediately started the POC by installing the core monitoring to test the SharePoint on-premises environment.
GSX experts configured GSX Gizmo for Office 365 to:
- Check the performance of the SharePoint sites on-premises.
- Perform end-user scenarios such as uploading and downloading documents, creating a site and using search services.
This would establish a baseline for current performance in the on-premises environment.
At the same time, the GSX team worked with the customer's Office 365 team to set up five Robot Users for the POC that would constantly check the availability of SharePoint Online and OneDrive.
The first location chosen was the headquarters in Germany (Frankfurt), where the fewest tickets were coming from. That site seemed to run just fine, while the offices in India, China and even Spain experienced many issues. So it was decided to install one Robot User in Shanghai, one in Madrid, one in Bangalore and the last one in another important location in Switzerland (Basel).
On each Robot User, SharePoint Online and OneDrive tests were configured to log in, then upload and download a document every 5 minutes. On top of that, every available network test was configured to run in parallel, to provide data and support the root cause analysis of all these tickets: round-trip ping to the proxy and to the Office 365 datacenter, packet loss analysis, DNS resolution time and availability, and port connectivity checks.
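To make the network side of these tests more concrete, here is a minimal Python sketch of three of the checks listed above: DNS resolution time, port connectivity and packet loss. It is not how GSX Gizmo implements them; the function names are hypothetical and only the Python standard library is used. Note that the DNS timing can be affected by local resolver caching.

```python
import socket
import time
from typing import Optional, Sequence


def dns_resolution_ms(hostname: str) -> Optional[float]:
    """Time a DNS lookup in milliseconds; return None if resolution fails."""
    start = time.perf_counter()
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return None
    return (time.perf_counter() - start) * 1000.0


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def packet_loss_percent(replies: Sequence[bool]) -> float:
    """Percentage of probes without a reply (True means a reply was received)."""
    if not replies:
        raise ValueError("at least one probe result is required")
    lost = sum(1 for ok in replies if not ok)
    return 100.0 * lost / len(replies)
```

A scheduler could call, for example, `dns_resolution_ms("contoso.sharepoint.com")` and `tcp_port_open("contoso.sharepoint.com", 443)` every 5 minutes from each location, where `contoso` is a placeholder tenant name, and store the results alongside the upload and download timings.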
Alerts were fine-tuned to be extremely sensitive to end-user issues, which enabled us to detect any performance problem quickly. We detected many of them during the first week of the POC.
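As an illustration of what a sensitive alert rule can look like, the Python sketch below flags every failed action and every action slower than a limit, without waiting for several bad samples in a row. The `Measurement` type, the `find_alerts` function and the threshold are assumptions made for this example, not GSX Gizmo settings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Measurement:
    location: str             # e.g. "Madrid"
    action: str               # e.g. "upload" or "download"
    seconds: Optional[float]  # None means the action failed outright


def find_alerts(measurements: List[Measurement], max_seconds: float) -> List[str]:
    """Return one alert message per failed or slow measurement."""
    alerts = []
    for m in measurements:
        if m.seconds is None:
            alerts.append(f"{m.location}: {m.action} failed")
        elif m.seconds > max_seconds:
            alerts.append(
                f"{m.location}: {m.action} took {m.seconds:.1f}s "
                f"(limit {max_seconds:.1f}s)"
            )
    return alerts
```

Alerting on a single slow sample is noisy by design. It suits a short POC where the goal is to catch brief slowdowns; a production setup might require several consecutive breaches before alerting.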
All the data was then centralized and extracted into Power BI reports to show the end-user performance pattern to management and to each person in charge of the local network at the critical branch offices. The POC lasted one month, with GSX experts deeply involved in helping the Office 365 team analyze the tickets, understand how to fix them and then prevent them.
The outcome for the company
The outcome was valuable enough that the company then decided to extend the use of Robot Users to several other branch offices.
So, what did we find?
The first finding is that comparing on-premises and Online performance did not really explain the situation. As a matter of fact, the end-user experience was comparable in both. Moving Online was therefore not the root cause of the issue, and the service provided by Microsoft was in fact more reliable (fewer interruptions) than the previous on-premises one.
The second outcome was to finally understand whether people were complaining for good reason. In the first phase, the team decided to focus only on the tickets opened by end-users and not on the alerts that GSX Gizmo was generating. The purpose was to measure end-user performance and network statistics at the time each ticket was created and compare them with a “normal” situation, using the data stored and displayed in Power BI.
It quickly became clear that the performance issues were real but lasted only for very short periods of time. On top of that, the traceroute analysis provided by the Robot Users showed a significant change in the number of hops to the cloud at those moments. Packet loss on the network was also extremely unstable.
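The comparison between a ticket's time window and the “normal” situation can be sketched in a few lines of Python. This is an illustrative assumption about how such an analysis could be done, not the actual Power BI reports; the function names and the hop tolerance are hypothetical. Because the slowdowns were very short, the sketch compares the worst sample in the window with the baseline median, since a window average could hide the spike.

```python
from statistics import median
from typing import Sequence


def worst_slowdown(baseline_seconds: Sequence[float],
                   window_seconds: Sequence[float]) -> float:
    """Ratio of the slowest sample around the ticket to the baseline median."""
    return max(window_seconds) / median(baseline_seconds)


def hop_count_changed(normal_hops: Sequence[int],
                      observed_hops: int,
                      tolerance: int = 2) -> bool:
    """True if the traceroute hop count deviates from the usual median
    by more than `tolerance` hops."""
    return abs(observed_hops - median(normal_hops)) > tolerance
```

For example, with made-up baseline upload times of `[2.0, 2.2, 1.8, 2.0, 2.1]` seconds and a ticket window of `[2.1, 9.0, 2.0]`, `worst_slowdown` returns 4.5, so the user did see a real but brief slowdown.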
China and Spain had the same problems. For Bangalore, it quickly became obvious that the internet connection itself was causing major issues in the use of SharePoint Online and OneDrive. With these findings, our customer carried out a general review of the local equipment in the Chinese and Spanish offices to improve its reliability and performance.
The Indian office had to change its Internet provider contract to get far more bandwidth than before. This reduced the number of tickets within only a few weeks.
The second phase was then to understand every GSX alert and act on it immediately, while creating a new process to notify the branch offices that issues had been identified and that IT was working on them.
Thanks to GSX’s unbiased data, the company set up a simple process to alert support before users complain. This contributed to a drastic reduction in the number of tickets opened by the branch offices. After two months with the solution in production, the reduction in tickets was approaching 60%.
As a result, the company decided to keep GSX Gizmo, increasing the number of Robot Users to 20. They don’t monitor every critical location but rather use them mainly for troubleshooting. When a branch office repeatedly generates end-user complaints and tickets, the SharePoint project team installs a Robot User there to analyze the situation and improve it. Once the situation improves, they move the Robot User to another location in need.