Optimizing SYN vs SACK Probing Methods to Avoid Unexplainable Packet Loss
When running Cloud and Enterprise Agent network tests through firewalls, a common issue is “background noise”: packet loss between the agent and the target that has no clear cause.
An example of this is shown in the image below. There is a constant 4% packet loss, but none of the intermediate nodes in the path visualization show any packet loss that contributes to it. This article explains the cause and possible solutions for such noise.
Two symptoms are clear indicators that the perceived packet loss is noise, rather than actual loss that should be investigated:
The loss can’t be found in the path visualization: If there is a node in the network causing the loss, you should investigate that first.
The packet loss is constant: In the example above, there is a constant 4% packet loss. Real-world loss rarely behaves this way. A circuit can experience loss consistently, but usually not at such a steady rate.
If both symptoms are present, then the loss is likely background noise, and following the steps in this article will provide a good chance of solving the problem.
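The two-symptom check above can be sketched in code. This is a minimal illustration, not a ThousandEyes feature: the function name, the input shapes (a list of end-to-end loss percentages per test round and a mapping of path node to observed loss), and both thresholds are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def looks_like_background_noise(end_to_end_loss, per_node_loss,
                                min_loss=0.5, max_spread=1.0):
    """Return True when both symptoms of background noise are present.

    end_to_end_loss: loss percentage per test round (hypothetical input).
    per_node_loss:   mapping of path node -> loss percentage (hypothetical input).
    min_loss, max_spread: illustrative thresholds, in percentage points.
    """
    if not end_to_end_loss:
        return False

    # Symptom: the loss is constant, i.e. present in the data and
    # barely varying from round to round.
    is_constant = (mean(end_to_end_loss) >= min_loss
                   and pstdev(end_to_end_loss) <= max_spread)

    # Symptom: no node in the path visualization shows any loss.
    no_node_loss = all(loss == 0 for loss in per_node_loss.values())

    return is_constant and no_node_loss


# A steady 4% with a clean path matches both symptoms.
print(looks_like_background_noise([4, 4, 4, 4], {"hop1": 0, "hop2": 0}))
```

If a node in the path does show loss, the function returns False, which matches the advice to investigate that node first.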
A network level test in ThousandEyes consists of two parts: An end-to-end measurement, which determines the packet loss and latency we see in the graph above, and the path trace, which builds the path visualization. For full details, see Network Tests Explained.
The default method for end-to-end probing uses SACK-based measurement. This method places the lightest load on the network, the test target, and any firewalls in between.
Unfortunately, some firewall settings produce false positives: the firewall misidentifies some of the end-to-end measurement packets as unwanted traffic and drops them, which results in the background noise.
ThousandEyes recommends ensuring that your tests are as clean as possible when bringing them into production. This ensures that a real event is not missed because an operator thinks it's normal (for example, during implementation it might be known that the 4% loss is normal, but this kind of knowledge is easily lost).
SACK-based measurements are the default for a reason, so the first step is to try to resolve this issue without changing the probing method.
First, check whether a firewall within your control is causing these issues. Work with your firewall administrator to determine whether packets to or from the affected agent are being dropped, or whether an advanced threat rule is being triggered. Adjusting these firewall settings is the recommended solution.
However, if the solution can't be found by changing firewall settings, you can change the probing mechanism instead.
Changing the probing mechanism significantly increases the number of sessions the agents create. For more information, see Network Tests Explained - Advanced Topics.
In normal circumstances this shouldn’t be an issue, but it can have adverse effects when there are resource-constrained firewalls in the path, or when a large number of agents are configured to run the same test.
Always consider and test any impact this change might have on the rest of the environment before deploying in a production environment.
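To get a feel for the scale of that impact, the rough estimate below compares the sessions a firewall would have to track per test round. It is a sketch under stated assumptions, not a description of the agent's actual behavior: it assumes each SYN probe opens its own session, that SACK probing reuses one session per agent, and that a round sends 50 probes. The function name and the mode strings are hypothetical. Check Network Tests Explained - Advanced Topics for the real figures.

```python
def sessions_per_round(mode, agents, probes_per_round=50):
    """Estimate firewall sessions created by one test round.

    Assumptions (not taken from the product documentation):
    - "syn": every probe packet opens a new session.
    - "sack": all probes of an agent share a single session.
    - probes_per_round=50 is an illustrative default.
    """
    if mode == "syn":
        return agents * probes_per_round
    if mode == "sack":
        return agents
    raise ValueError(f"unknown probing mode: {mode!r}")


# Example: 20 agents running the same test.
for mode in ("sack", "syn"):
    print(mode, sessions_per_round(mode, agents=20))
```

Under these assumptions, the session count grows with both the number of agents and the number of probes, which is why firewalls with limited session tables deserve attention before the change goes live.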
To change the probing mechanism, open the test configuration panel for the relevant test in the ThousandEyes web application, navigate to the Network section, and change the probing mode from “Prefer SACK” (the default) to “Force SYN”.
Save the changes, and observe the test to see if the background packet loss goes away.
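If tests are managed through an API instead of the web application, the same change could be expressed as a small update payload. The sketch below only builds the JSON body and sends nothing. The field name `probeMode` and the values `AUTO` and `SYN` are assumptions that should be verified against the ThousandEyes API reference before use, and the helper name is hypothetical.

```python
import json

# Assumed mapping from the web application labels to API values.
# Verify the field name and values against the API reference.
PROBE_MODES = {
    "Prefer SACK": "AUTO",
    "Force SYN": "SYN",
}


def probe_mode_update(ui_label):
    """Build a JSON body that sets the probing mode of a test."""
    if ui_label not in PROBE_MODES:
        raise ValueError(f"unsupported probing mode: {ui_label!r}")
    # "probeMode" is an assumed field name.
    return json.dumps({"probeMode": PROBE_MODES[ui_label]})


print(probe_mode_update("Force SYN"))
```

Keeping the label-to-value mapping in one place makes it easy to revert to the default if the change causes problems on the firewall.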
Let's look again at the first test. In this instance, the firewall configuration couldn’t be changed, so instead we changed the probing mechanism to “Force SYN”.
As you can see, the background noise is gone, and the only real packet loss is shown in the trace.