This document explains each of the metrics that the ThousandEyes platform captures. These metrics are used across the platform and may be of particular interest to anyone who defines alert rules.
These metrics apply to scheduled tests run on Cloud and Enterprise Agents. For information on Endpoint Agent metrics, see the article Data Collected by the Endpoint Agent.
Loss: End-to-end packet loss. The percentage of packets lost is calculated by subtracting the number of reply packets the agent receives from the target (responses) from the number of packets sent by the agent, then dividing by the number of packets sent, then multiplying by 100.
Latency: The average round-trip packet time. Round-trip packet time is the time from when the agent sends a packet until the agent receives the reply.
Jitter: The standard deviation of latency. The standard deviation indicates how widely the measurements are spread around the average. A larger standard deviation indicates a wider spread of the measurements.
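The Loss, Latency and Jitter definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not say which one is used.

```python
from statistics import mean, pstdev


def loss_percent(packets_sent, replies_received):
    """Packets sent minus replies received, divided by packets sent, times 100."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    return 100 * (packets_sent - replies_received) / packets_sent


def latency_ms(round_trip_times_ms):
    """Latency is the average of the round-trip times of the answered packets."""
    return mean(round_trip_times_ms)


def jitter_ms(round_trip_times_ms):
    """Jitter is the standard deviation of the round-trip times.

    Assumption: population standard deviation (pstdev).
    """
    return pstdev(round_trip_times_ms)
```

For example, 50 packets sent with 45 replies gives a loss of 10%, and round-trip times of 10, 12 and 14 ms give a latency of 12 ms.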
Available Bandwidth: Total available bandwidth between source and destination measured in Mbps.
Capacity: This metric appears in the Available Bandwidth table. It is an estimate of the link capacity in Mbps. The estimate is also used in a subsequent step to measure the available bandwidth.
Avg. Response (node property): Time-to-respond, averaged across all probe packet response timings.
Delay and Min. Delay (link properties): Estimated minimum transmission delay across a given link. The value is calculated by finding the agent that reported the lowest response time for the node on the right-hand side of the link (the node further from the agent), then subtracting the same agent's lowest detected response time for the node on the left-hand side of the link (the node closer to the agent). When a single path trace from a single agent traverses the link, this metric is called Delay.
No. of Traces (link property): The number of path traces that traversed a particular network path between two nodes. Expressed as X of Y, where Y is the number of all path traces across all agents presented in the test result.
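The Delay and Min. Delay calculation described above can be sketched as follows. The input layout (a dictionary keyed by agent, holding the response times observed for the near and far node of a link) and the function name are hypothetical and chosen only for illustration.

```python
def min_link_delay(responses_by_agent):
    """Estimate the minimum delay across one link.

    responses_by_agent maps an agent name to a dict with two lists of
    response times in ms: "near" (node closer to the agent, left-hand side)
    and "far" (node further from the agent, right-hand side).
    """
    # Pick the agent that saw the lowest response time for the far node.
    best_agent = min(
        responses_by_agent,
        key=lambda agent: min(responses_by_agent[agent]["far"]),
    )
    best = responses_by_agent[best_agent]
    # Subtract that same agent's lowest response time for the near node.
    return min(best["far"]) - min(best["near"])
```

With a single agent and a single path trace in the input, the same subtraction yields the value the text calls Delay.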
Reachability: The percentage of time during a round (15 minutes) in which the BGP router had a route to reach the destination prefix.
Path Changes: The number of AS path changes during a round. If a route is withdrawn and re-announced, it counts as 2 changes.
Updates: The plain count of BGP updates during a round of measurements. This number should always be greater than or equal to the number of path changes.
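The relationship between Reachability, Path Changes and Updates can be illustrated with a small sketch. The representation of an update as "the AS path after the update, or None for a withdrawal" is an assumption made for this example, as are the function names and the AS numbers (taken from the range reserved for documentation).

```python
ROUND_SECONDS = 15 * 60  # one round lasts 15 minutes


def reachability_percent(seconds_with_route, round_seconds=ROUND_SECONDS):
    """Share of the round during which the router had a route to the prefix."""
    return 100 * seconds_with_route / round_seconds


def summarize_round(initial_path, updates):
    """Count updates and AS path changes in one round.

    initial_path is the AS path at the start of the round.
    updates is a list with the AS path after each update (None = withdrawn).
    """
    path_changes = 0
    current = initial_path
    for new_path in updates:
        if new_path != current:
            path_changes += 1
        current = new_path
    return {"updates": len(updates), "path_changes": path_changes}
```

A withdrawal followed by a re-announcement of the same path is two updates and two path changes. An update that repeats the current path adds to Updates only, which is why Updates can exceed Path Changes but never fall below it.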
Availability: The DNS server test performs a name resolution against a specific set of DNS servers. Availability is reported either per server (in which case it is 0% or 100% for a given agent) or as an average across all the servers. If a name server does not provide a resource record for the specified domain and type, the result is considered an error (availability < 100%).
Resolution Time: The time it takes to query a specific name server for the given domain.
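As a rough illustration of the DNS server Availability metric above, the sketch below treats each server's result as the list of resource records it returned. The data layout, the function names and the server and address values are hypothetical.

```python
def server_availability(records):
    """100% if the server returned at least one record for the query, else 0%."""
    return 100.0 if records else 0.0


def average_availability(records_by_server):
    """Average of the per-server availability values for one agent."""
    values = [server_availability(r) for r in records_by_server.values()]
    return sum(values) / len(values)
```

If three of four queried servers return a record, the per-server values are 100%, 100%, 100% and 0%, and the average is 75%.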
Availability: The trace test does a resolution over the entire delegation chain for a given domain, similar to "dig +trace". Availability is 100% if the agent was able to resolve the given domain name, and 0% if an error was encountered at some point during the test.
Queries: The number of queries used to produce the trace.
Final Query Time: The resolution time of the very last query in the trace.
Availability: A resolution is successful if the open resolver answers with a resource record for the given domain (availability=100%), otherwise the resolver is marked with an error that can be either "No mapping", "NXDOMAIN", "SERVFAIL", or "Truncated". The Availability is typically shown in an aggregated format (per country or per network).
Vantage Points: The number of open DNS resolvers used to collect the data.
Resolution Time: The time it takes for the DNS resolver to perform a DNS resolution to the very last name server in the chain. As with the previous metric, this metric is shown in an aggregated format, either per country or per network.
Mappings: A DNS resource record is a mapping of one value to one or more other values. For example, the A record for "google-public-dns-a.google.com" maps to the IP address 8.8.8.8 and the A record for "google-public-dns-b.google.com" maps to the IP address 8.8.4.4.
% of Vantage Points: The fraction of resolvers that reply to the DNS query with a specific resource record. As with all DNS+ metrics, this metric is aggregated either per country or per network. For example, a weight of 6.9% for the mapping 220.127.116.11, obtained by querying www.facebook.com A in the United States, means that 6.9% of the resolvers that were queried for www.facebook.com type A had that resource record in the response.
Latency: The time it takes for a DNS resolver to perform name resolution to a designated name server. As with previous metrics, this metric is shown in an aggregated format, either per country or per network.
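The % of Vantage Points metric described above is a per-mapping share of resolvers. A minimal sketch, assuming the responses for one country or network are available as a dictionary from resolver to the records it returned (the names and addresses are hypothetical):

```python
from collections import Counter


def mapping_percentages(records_by_resolver):
    """Percentage of queried resolvers that returned each resource record."""
    total = len(records_by_resolver)
    counts = Counter(
        record
        for records in records_by_resolver.values()
        for record in set(records)  # count each record once per resolver
    )
    return {record: 100 * n / total for record, n in counts.items()}
```

Because a resolver may return several records, the percentages for different mappings do not have to add up to 100%.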
Availability: The Availability for a given agent should be 100% if the HTTP status code is 2xx or 3xx, and 0% otherwise. The average Availability can take any value from 0% to 100%.
Response Code: The HTTP status code the agent receives when fetching the URL. The test follows at most 10 consecutive redirects, and the count is shown in the table as "Number of redirects". See this w3.org page for definitions of HTTP response codes.
Response Time: Also known as time-to-first-byte, this is the time from the beginning of the request (before DNS request) until the client receives the first byte of the response from the web server.
DNS Time: The time required for the agent to perform a DNS resolution of the hostname in the URL.
Connect Time: The time required to establish a TCP connection with the web server.
SSL Negotiation Time: The time required to negotiate SSL/TLS.
Wait Time: The time elapsed between the completion of sending the HTTP request and the time the agent receives the first byte of the response from the web server.
Receive Time: The time elapsed in receiving the response from the server (time from first byte to last byte of payload).
Throughput (MB/s): The throughput, measured in megabytes per second. This is the Wire Size divided by the Receive Time.
Wire Size: The size of the object while in transmission. Often, HTTP responses are compressed prior to transmission, so wire size may be less than actual size if the object is compressed by the server.
Total Time: The sum of Response Time and Receive Time. In other words: DNS Time + Connect Time + SSL Negotiation Time + Wait Time + Receive Time.
Response Time: Also known as time-to-first-byte, this is the time from the beginning of the request (before DNS request) until the client receives the first byte of the response from the web server.
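The HTTP server metrics above fit together arithmetically, which the following sketch tries to make concrete. The class and field names are hypothetical. Treating 1 MB as 1,000,000 bytes is an assumption, and Response Time is computed from the four phases listed in the Total Time definition.

```python
from dataclasses import dataclass


@dataclass
class HttpTimings:
    dns_ms: float
    connect_ms: float
    ssl_ms: float
    wait_ms: float
    receive_ms: float
    wire_size_bytes: int

    @property
    def response_time_ms(self):
        # Time to first byte: every phase before the payload starts arriving.
        return self.dns_ms + self.connect_ms + self.ssl_ms + self.wait_ms

    @property
    def total_time_ms(self):
        return self.response_time_ms + self.receive_ms

    @property
    def throughput_mb_per_s(self):
        # Wire Size divided by Receive Time (assumes 1 MB = 1,000,000 bytes).
        return (self.wire_size_bytes / 1_000_000) / (self.receive_ms / 1000)


def agent_availability(status_code):
    """100% for a 2xx or 3xx status code, 0% otherwise."""
    return 100.0 if 200 <= status_code < 400 else 0.0


def average_availability(status_codes):
    """Average across agents, which can be any value from 0% to 100%."""
    return sum(agent_availability(c) for c in status_codes) / len(status_codes)
```

For instance, timings of 20, 30, 50 and 100 ms for DNS, connect, SSL and wait give a Response Time of 200 ms, and a further 200 ms of receive time gives a Total Time of 400 ms.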
Total Wire Size (kB): The size of all the objects in the page while in transmission. Often, HTTP responses are compressed prior to transmission, so wire size may be less than actual size if any objects are compressed by the server.
Throughput (kbps): The throughput, measured in kilobits per second. This is the Total Wire Size divided by the sum of the Receive Times for each object (taking into account that some of the downloads are overlapping). For example, if object A has an overlap of 50% with object B and they both take T time to download, the time that is considered for Throughput computation is 1.5*T.
Component Errors: The number of component errors in the page. For example, an image that fails to load.
DOM Load Time (ms): Also known as time-to-interaction, the time required for the browser to build the Document Object Model for the page, which is the skeleton of the page, not including images. This maps to the DOMContentLoaded event.
Page Load Time (ms): This maps to the load event triggered when the web page is fully loaded. Generally, the Page Load time is higher than the DOM Load time.
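The page load Throughput (kbps) definition above counts overlapping downloads only once. One way to express that is to merge the receive intervals of all objects and measure the merged length. The sketch below does this under two stated assumptions: receive intervals are given as (start_ms, end_ms) pairs, and 1 kB is 8 kilobits. The function names are hypothetical.

```python
def merged_duration_ms(intervals):
    """Total time covered by at least one interval (overlaps counted once)."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged span.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the current merged span if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def page_throughput_kbps(total_wire_size_kb, receive_intervals_ms):
    """Total Wire Size divided by the merged receive time."""
    seconds = merged_duration_ms(receive_intervals_ms) / 1000
    return (total_wire_size_kb * 8) / seconds
```

Two objects that each take T = 100 ms to download and overlap by 50%, for example (0, 100) and (50, 150), give a merged duration of 150 ms, which is the 1.5*T from the example in the text.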
Blocking: The time an HTTP request waits for system resources to become available that are needed to make a connection to a web server. The most common reason for significant blocking time is the per-domain concurrent connection limit.
DNS: The time required for the agent to perform a DNS resolution of the domain name in the request for the object.
Connect: The time required to establish a TCP connection with the web server.
SSL: The time required to negotiate SSL/TLS.
Send: The time required to send the HTTP request to the web server.
Wait: The time elapsed between the completion of sending the HTTP request and the time the agent receives the first byte of the response from the web server.
Receive: The time elapsed in receiving the response from the server (time from first byte to last byte of payload).
Completion: Each web transaction has an initial and final step that can be configured by the user. Completion refers to the fraction of transaction steps that are executed between the initial and final steps.
Transaction Time: The time taken to execute the transaction, counting from the initial step until the final step.
Throughput: The throughput, measured in megabits per second (Mbps). This is the Wire Size divided by the Transfer Time.
Wire Size: The size in bytes of the data transferred by FTP while in transmission. An FTP server may compress data prior to transmission, so wire size may be less than the original file size if the object is compressed by the server.
DNS Time: The time required for the agent to perform a DNS resolution of the domain name in the request for the object.
Connect Time: The time required to establish a TCP connection with the FTP server.
Negotiation Time: Beginning after Connect, up to the time the FTP upload, download or list command is issued (includes login).
Wait Time: Beginning after the FTP upload, download or list command is issued, until the first byte of the transfer.
Transfer Time: The time taken to transmit all bytes of the transfer.
Total Time: The sum of the previous timings.
Response Time: The sum of DNS Time, Connect Time, Negotiation Time and Wait Time.
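The FTP timings above combine in the same way as the HTTP ones. The sketch below is illustrative only: the names are hypothetical, "the previous timings" in Total Time is read as DNS, Connect, Negotiation, Wait and Transfer, and 1 megabit is taken to be 1,000,000 bits.

```python
from dataclasses import dataclass


@dataclass
class FtpTimings:
    dns_ms: float
    connect_ms: float
    negotiation_ms: float
    wait_ms: float
    transfer_ms: float
    wire_size_bytes: int

    @property
    def response_time_ms(self):
        # DNS Time + Connect Time + Negotiation Time + Wait Time
        return self.dns_ms + self.connect_ms + self.negotiation_ms + self.wait_ms

    @property
    def total_time_ms(self):
        # Assumption: Total Time also includes Transfer Time.
        return self.response_time_ms + self.transfer_ms

    @property
    def throughput_mbps(self):
        # Wire Size divided by Transfer Time, in megabits per second.
        megabits = self.wire_size_bytes * 8 / 1_000_000
        return megabits / (self.transfer_ms / 1000)
```

A 2,500,000-byte transfer that takes 2 seconds works out to 10 Mbps under these assumptions.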
Availability: Percentage of time that the server responds to a request with a successful status code. By default, successful status codes are 2xx/3xx, but can be set to other status codes.
Response Time: Also known as time-to-first-byte, response time is the time elapsed from the beginning of the request (before DNS resolution) until the client receives the first byte of the response from the server. The Response Time metric is accompanied by the individual timings that comprise Response Time: DNS Time, Connect Time and Wait Time.
Total Time: The time spent performing all phases of the test. The Total Time metric is accompanied by the individual timings that comprise Total Time: DNS Time, Connect Time, Redirect Time, Register Time (if SIP registration is performed) and Options Time.
Mean Opinion Score (MOS): A number indicative of the perceived voice call quality. Because MOS is highly subjective, individual transmission parameters such as codec characteristics, delay and loss/discard ratio are transformed into “impairment factors” to obtain the R Factor. The following table relates user opinion MOS and R Factor:
Loss: The percentage of packets in a stream of UDP-encapsulated RTP packets that do not reach the destination.
Discards: The percentage of packets that are delayed over the network and are no longer required when they reach the destination.
Latency: The average time taken by the packets to reach the destination.
Packet Delay Variation (PDV): The average variation in unidirectional delay for packets reaching the destination.
Receiving IP: Source IP address of RTP stream as received at target agent.
Target IP: Target IP address of RTP stream as sent by source agent.
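For the Mean Opinion Score entry above, the ITU-T G.107 E-model defines a standard conversion from R Factor to an estimated MOS. The sketch below implements that published formula. Whether the platform uses exactly this conversion is an assumption, and the function name is hypothetical.

```python
def r_factor_to_mos(r):
    """Estimate MOS from an R Factor using the ITU-T G.107 conversion."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
```

The result is bounded between 1.0 and 4.5, so higher R Factor values map to higher perceived call quality up to that ceiling.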