Avoiding common Office 365 network configuration mistakes
Let's take a look at DNS configuration.
We worked with a customer who had trouble accessing Office 365 from their European domain.
After comparing performance across multiple locations and confirming that the issue was only happening in Europe, we started to dig into their network configuration.
Where was the data going? Where was it leaving the network? How was it processed on the Microsoft backbone?
We came to realize that the issue was neither the European access point nor their local network: the data was crossing the US before reaching the Microsoft backbone.
Specifically, it was going to the northwestern United States.
We kept digging and found that the customer had an internal DNS pointing to Google DNS, and you can imagine that, from Europe, this DNS would not provide very good performance.
When European clients went through a US West Coast DNS to retrieve their email, their connection performance took a nosedive.
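One way to check which answers a site actually gets from its configured DNS is to time a lookup and record the addresses returned, then compare the output across locations. The following is a minimal sketch, not the tooling we used in this test: the function name `time_resolution` is hypothetical, and it only measures whatever resolver the operating system is configured to use.

```python
import socket
import time


def time_resolution(hostname, resolve=socket.getaddrinfo):
    """Resolve hostname and return (sorted unique addresses, elapsed seconds).

    `resolve` defaults to the operating system's resolver, so the result
    reflects whichever DNS server this machine is configured to use.
    It can be replaced with any callable that has the same shape.
    """
    start = time.perf_counter()
    records = resolve(hostname, 443, proto=socket.IPPROTO_TCP)
    elapsed = time.perf_counter() - start
    # Each record is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    addresses = sorted({record[4][0] for record in records})
    return addresses, elapsed
```

Calling `time_resolution("outlook.office365.com")` from a European site and from a US site (the hostname is only an illustrative choice) and comparing the returned addresses gives a first hint as to whether European clients are being handed a far-away entry point.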
To show how this degrades performance, we reproduced the same situation and checked the end-user experience with our Robot Users.
We configured the Robot User to force IP address resolution in the US, then send the traffic back to EMEA, from where it would be sent to the US again before finally reaching Microsoft.
We then added that Robot User to the test we ran previously on the headquarters entry point, in order to see how badly it would affect the results.
You can see more information about the test protocols if you read this article >>
Let’s take a look at the data.
Here you can see the results for the yellow Robot User.
Its average time for any action is much worse than the others', and its number of spikes that could lead to support tickets is a lot higher than for any of the three other Robot Users that we configured earlier.
To learn more about how to interpret the statistics we display in Power BI dashboards, please read this article >>
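The two figures discussed above, the average action time and the number of spikes, are simple to compute from raw timings. Here is a minimal sketch; the function name, the 2000 ms spike threshold, and the sample timings are all invented for illustration and are not the data behind the dashboard.

```python
from statistics import mean


def summarize_action_times(samples_ms, spike_threshold_ms):
    """Return (average, spike_count) for one Robot User's action timings."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    # A "spike" here is any single action slower than the chosen threshold.
    spikes = sum(1 for sample in samples_ms if sample > spike_threshold_ms)
    return mean(samples_ms), spikes


# Hypothetical timings in milliseconds, invented for illustration only.
timings = {
    "local_dns": [400, 420, 410, 430],
    "remote_dns": [900, 2500, 950, 3100],
}
for name, samples in timings.items():
    average, spikes = summarize_action_times(samples, spike_threshold_ms=2000)
    print(f"{name}: average {average:.0f} ms, {spikes} spike(s)")
```

Looking at both numbers together matters: an average can look acceptable while a handful of spikes still generates tickets.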
The more distance you put between your users and the Microsoft tenant, the more severe the performance issues you will experience.
This is an extreme case, but it shows how the network configuration can truly affect the end-user experience.
In the same way, a VPN can also impact end-user performance.
Connecting with a VPN can greatly impact performance, depending on the protocol, its settings, and especially the gateway settings.
Once you’re connected to the VPN, connecting from Europe, even to a US tenant, will have a great impact, because your traffic is actually bouncing off a different location due to VPN masking.
Take Skype for Business, for example: its traffic is already encrypted with TLS, and media workloads are encrypted with Secure Real-time Transport Protocol (SRTP).
So using a VPN just adds another layer of encryption and additional network hops, and ultimately degrades end-user performance by increasing jitter, packet loss, and latency.
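To compare a connection with and without the VPN, jitter and packet loss can be derived from a series of latency probes. The sketch below uses a deliberately simplified definition of jitter (the average absolute change between consecutive samples), not the smoothed interarrival jitter estimator that RTP defines in RFC 3550; the function names are hypothetical.

```python
def mean_jitter(latencies_ms):
    """Average absolute change between consecutive latency samples."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two samples")
    deltas = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return sum(deltas) / len(deltas)


def packet_loss_ratio(sent, received):
    """Fraction of probes that never came back (0.0 means no loss)."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent
```

Running the same probes over the direct path and over the VPN, then comparing the two sets of numbers, puts figures on the extra hops described above.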
Once again, the configuration of your entry point is what really matters here. You need to get onto the Microsoft network as fast as you can and eliminate unnecessary layers of infrastructure and security if you want the best end-user experience.
The case of the authenticated proxy
Similarly, here is another network configuration that we have encountered multiple times with customers who had connectivity issues because of it.
Many of our customers had issues after installing a proxy to connect to Office 365. What seemed like a good idea quickly turned into a source of additional costs.
We reproduced the situation with our Robot Users to provide data on that proxy-related issue.
End-User experience comparison with authenticated proxy
The dashboard shows our free/busy tests run on multiple Robot Users. It is worth understanding who is doing what here. Let’s focus on the square graph, which compares the time taken to complete this free/busy action.
The first square (grey) represents the results of the Robot User from Philadelphia connecting to the European tenant. The second one (black) is a Bangalore Robot User that had bandwidth issues and is connecting to the European tenant.
What is interesting here is the third one (bottom left, green). This is a Robot User sitting in Philadelphia connecting to a US tenant, but the connection to the tenant is configured to redirect every outgoing connection to an authenticated proxy first.
The authenticated proxy server simply requires clients to authenticate before going to Office 365.
What is noticeable here is that this US Robot User connecting to a US tenant performs worse than the Robot User sitting in Bangalore and connecting to the US tenant (the top red square).
This is a very good example of how a single piece of network equipment between your users and Office 365 can drastically degrade end-user performance.
Just adding a simple piece of equipment like a proxy can really affect the performance delivered to end-users, to the point where it costs a lot in terms of support and overall productivity.
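The comparison above can be approximated by timing the same request over a direct path and through the proxy. This is a minimal sketch under stated assumptions, not how the Robot Users are implemented: `make_opener` and `time_request` are hypothetical names, and credentials embedded in the proxy URL only cover basic proxy authentication, so proxies that rely on other schemes would need a different client.

```python
import time
import urllib.request


def make_opener(proxy_url=None):
    """Return an opener that goes direct, or through proxy_url if given."""
    # An empty mapping tells ProxyHandler to ignore system proxy settings.
    proxies = {"http": proxy_url, "https": proxy_url} if proxy_url else {}
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


def time_request(opener, url, timeout=30):
    """Return the seconds taken to fetch url and read the full response."""
    start = time.perf_counter()
    with opener.open(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start
```

Repeating `time_request` several times with `make_opener()` and with `make_opener("http://user:password@proxy.example:8080")` (a placeholder address) shows how much of the total time is spent in the proxy hop rather than in Office 365 itself.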
After this set of tests, we came to the conclusion that there are several fundamental characteristics needed to be able to manage the end-user experience properly.
First, you need to know your own network configuration and measure the end-user experience constantly; you need those metrics because you have to analyze the data in order to understand what is going on.
Second, you need to have multiple points of connectivity for comparison, meaning multiple Robot Users in order to understand if issues are affecting one or multiple locations or your entire tenant.
Then, you need to measure the end-user experience at the action level. We’ve clearly shown that normal behavior for an open-mailbox action doesn’t tell you much, unlike other actions such as the free/busy lookup or the create-meeting process, for example.
Finally, you need to understand how the network configuration is executed, where the data comes from, and how it goes to Microsoft. The route of the data and the way equipment is configured can dramatically affect the end-user experience and drastically increase your support costs.
That is why it is critical to monitor the end-user experience before, during, and after your network changes, so that you have facts about the effect of those changes.
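The second point above, using multiple points of connectivity to tell a local issue from a tenant-wide one, can be sketched as a simple classification. The function name, the labels, and the sample statuses are hypothetical and only illustrate the reasoning.

```python
def classify_scope(degraded_by_location):
    """Map {location: is_degraded} to (scope label, affected locations)."""
    if not degraded_by_location:
        raise ValueError("need at least one monitored location")
    affected = sorted(
        location
        for location, degraded in degraded_by_location.items()
        if degraded
    )
    if not affected:
        return "healthy", affected
    if len(affected) == len(degraded_by_location):
        # Every vantage point is slow: suspect the tenant or a shared path.
        return "all monitored locations", affected
    if len(affected) == 1:
        return "single location", affected
    return "multiple locations", affected


# Hypothetical statuses, invented for illustration only.
status = {"Philadelphia": False, "Bangalore": False, "Europe": True}
print(classify_scope(status))
```

With a single monitored location, a degraded result always lands in "all monitored locations", which is exactly why one vantage point cannot tell you whether the problem is local.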