Alerts
The ThousandEyes platform allows you to configure highly customizable alert rules and assign them to tests, in order to highlight or be notified of events of interest. For those who want simplicity in alert configuration and management, the ThousandEyes platform ships with default alert rules configured and enabled for each test.

Notifications

Alert notifications are delivered via email, via webhooks, or via a third-party integration such as AppDynamics, PagerDuty, Slack, or ServiceNow. Recipients are configured in the alert rule's Notifications tab. Alerts remain active in the ThousandEyes platform for as long as your alert rule conditions are met, but notification that the alert is active occurs only at the start of the active period. An alert rule can optionally be configured to send a second notification once the alert is no longer active.
For email notifications, when multiple alerts are raised simultaneously, their data will be grouped into a single email notification.
The webhooks integration permits users to send JSON-formatted alert data to a webhooks-enabled server via HTTP. The information can then be programmatically processed and subsequent actions taken automatically. For more information on configuring ThousandEyes alert rules with webhooks, see Using Webhooks.
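As an illustration, a minimal webhook receiver might look like the following Python sketch, which uses only the standard library. The payload field names shown ("ruleName", "testName") are assumptions for illustration, not the documented schema; inspect the JSON your endpoint actually receives.
```python
# A minimal webhook receiver sketch using only the Python standard library.
# The payload field names below ("ruleName", "testName") are illustrative
# assumptions; inspect the JSON your endpoint actually receives.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AlertWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        alert = json.loads(self.rfile.read(length))
        # Programmatically process the alert, e.g. open a ticket or page a team.
        print("Alert received:", alert.get("ruleName"), "on", alert.get("testName"))
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), AlertWebhookHandler).serve_forever()
```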
The AppDynamics integration sends ThousandEyes alert notifications to an AppDynamics instance for a specific application. You can set up multiple integrations to the same instance, targeting different applications or severity levels. For more information on configuring ThousandEyes alert notifications with AppDynamics, see AppDynamics Integration.
The PagerDuty integration allows you to use an Escalation Policy (which defines rules for notification destinations, repeat notifications and other actions) in your PagerDuty service to receive notifications from ThousandEyes. For more information on configuring ThousandEyes alert rules with PagerDuty, see PagerDuty Integration.
The Slack integration allows alert data to be sent to a chat or instant-message application. Users can send notifications to the Slack channel of their choice. For more information on configuring ThousandEyes alert rules with Slack, see Slack Integration.
The ServiceNow integration delivers notifications directly into a ServiceNow account so they can be processed and acted upon based on workflows defined within that system. For more information on configuring alert rules to send notifications directly into the ServiceNow platform, see ServiceNow Integration.

Global vs. Location Alerts

There are two types of alert conditions: global and location. A global alert triggers when the explicit alert conditions are violated in a customer's environment. For example, if you have an alert set up that triggers when connect time is greater than 150ms or response time is greater than 100ms for the same 10% of agents 2 of 2 times in a row, the global alert will only trigger when these explicit conditions are met (i.e., when 10% of agents violate the specified conditions 2 out of 2 rounds). You can see an example of this alert condition below:
This global alert has been triggered (1), as you can see in the picture below:
The next type of alert condition is a location alert, as seen in the picture above (2, 3, and 4). A location alert consists of one of the following, based on alert type, where a single location meets the alert conditions for at least one round, regardless of the thresholds set for the global alert:
  • For Cloud and Enterprise Agent rules, the rule is based on agents.
  • For Endpoint Agent rules, the rule is based on visited sites or on Endpoint Agents.
  • For BGP rules, the rule is based on BGP monitors.
  • For device rules, the rule is based on interfaces.
  • For Internet Insights rules, the rule is based on affected tests or on catalog providers.
Although the initial global alert above was triggered because either response time or connect time was higher than the alerting threshold for 2 out of 2 rounds for 10% of agents, additional locations will be added to this alert for violating the condition for only 1 round. For example, the San Francisco agent could have been added to the global alert for having a connect time above the threshold for only a single round, instead of the 2 out of 2 rounds required for triggering the global alert. It is important to note that location alerts trigger and clear independently of the global alert. If you see multiple locations triggered under a global alert, you cannot assume that all the listed locations met the initial alert criteria on a per-round basis. They could have been added for violating the condition for only 1 round. To verify which locations initially triggered the global alert condition, it is best to check the test data.
It is also important to note that the only location alerts displayed in the UI at the start of a global alert are those active at the time of trigger. This can lead to scenarios where a flapping agent was involved in the evaluation criteria of an alert that fired, but cleared before the global alert fired. For example, imagine an alert rule that states "Any 2 agents have an error 3 out of 3 rounds," and the following occurs:
  • Agent A - meets condition in Rounds 1, 2, 3
  • Agent B - meets condition in Rounds 2 and 3
  • Agent C - meets condition in Round 1
In the scenario above, 2 agents meet the criteria 3 out of 3 rounds: in Round 1, Agents A and C; in Rounds 2 and 3, Agents A and B. At the global alert trigger, only Agents A and B will be listed in the location alerts, since Agent C cleared before the global alert triggered, even though Agent C contributed to the trigger of the alert. This only happens when the alert conditions require multiple agents to meet an alert criterion multiple rounds in a row.
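The following sketch simulates this scenario, showing why Agent C does not appear among the location alerts at trigger time:
```python
# Simulates the flapping-agent example: "any 2 agents have an error
# 3 out of 3 rounds". Each round maps to the set of agents meeting
# the alert condition in that round.
rounds = {
    1: {"Agent A", "Agent C"},
    2: {"Agent A", "Agent B"},
    3: {"Agent A", "Agent B"},
}

# Global alert: at least 2 agents met the condition in each of the 3 rounds.
global_alert = all(len(agents) >= 2 for agents in rounds.values())

# Location alerts shown at trigger time: only agents still violating
# in the most recent round.
location_alerts = rounds[3]

print(global_alert)     # True - the rule fires
print(location_alerts)  # {'Agent A', 'Agent B'} - Agent C has already cleared
```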

Clearing Alerts

There are two states of an alert: triggered and cleared. An alert is triggered when the conditions set forth in the alert rule are met. This is the global alert, as described above. A global alert is only cleared once the number of location alerts falls below the triggering threshold of the alert rule. This means some location alerts may still be active after the global alert clears. For example, if the rule requires that 2 agents trigger, and initially 3 agents trigger, the global alert may clear even while 1 agent continues to trigger.
Keep this in mind when setting up rules, especially as it relates to proxy versus non-proxy metrics. Proxied agents can only collect proxied metrics. They will not collect non-proxy metrics such as packet loss, latency, and jitter. Thus, any active alerts using non-proxy conditions can never be cleared by proxied agents. To avoid this, we recommend separating any tests and alert rules for proxied agents from those for non-proxied agents. Tests that collect proxy metrics should only be assigned to the proxied agents and should use proxy metric-based alert rules. Tests without any proxied agents should use non-proxy metric-based alert rules.
If you run into the above situation where an alert is not clearing due to missing data or a mismatch in proxy versus non-proxy metrics, you can manually clear the alert by un-assigning the alert rule from the test and waiting for a round of data collection. This will clear the alert. You should then further refine the alert rule to match the specific criteria you need before re-assigning it to a test, based on the guidelines above.

Viewing Alerts

Current and past alerts can be viewed on the Alerts page. The Alerts page has two tabs:
  • Active Alerts: List of alerts currently active for any test within your account group.
  • Alert History: List of alerts no longer active from tests in your account group, shown chronologically on a timeline, with a table whose entries contain details on each alert.

Active Alerts

The Active Alerts tab shows all alerts currently active in your account group, and auto-refreshes every two minutes.
  1. Search: Search for alerts based on the following criteria: Alert ID, Alert Rule Name, Alert Type, Test ID, Test Name, Test Type, or Status. Entering text followed by the return/enter key executes the search and displays results in the table below. To filter alerts by more than one criterion, click either the All or Any link to specify whether the table rows must match all (AND) or any (OR) of the selected criteria.
  2. Alert Status:
  • A red box indicates that the alert rule is currently active for that test.
  • A green box indicates that the alert was recently cleared for the test. A cleared alert is shown under the Alert History tab.
  • A grey box indicates that the alert rule was disabled for that test.
  3. Alert Rule Name: Name of the alert rule currently active. Expand an alert rule for more detailed information, including agent or BGP monitor, Start/End Time, Metrics at Alert Start, Metrics at Alert End, and the duration for which the alert was active.
  4. Test Name: Name of the test for which the alert rule is currently active.
  5. Alert ID: When gathering details for an alert via the ThousandEyes API, use the alert ID to reference a particular alert.
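For example, the following Python sketch fetches one alert's details by ID. The endpoint path, API version, and authentication scheme shown are illustrative assumptions; consult the ThousandEyes API documentation for the scheme your account uses:
```python
# Illustrative sketch of fetching one alert's details by alert ID with the
# requests library. The endpoint path, API version, and bearer-token auth
# are assumptions; consult the ThousandEyes API documentation.
import requests

API_TOKEN = "your-api-token"  # placeholder credential
ALERT_ID = "12345"            # hypothetical alert ID from the Active Alerts table

response = requests.get(
    f"https://api.thousandeyes.com/v7/alerts/{ALERT_ID}",  # path assumed
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
response.raise_for_status()
print(response.json())
```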

Alert History

The Alert History tab tabulates alerts that have triggered and are now in a "cleared" or "inactive" state, or whose rules are "disabled". To interact with the Alert History page:
  1. Date and Time slider: Drag either the start or end bar to the desired date to view alerts active during that timespan. Your selection updates the From and To date and time fields automatically.
  2. Date and Time selector: The From and To fields allow manual input of the date and time endpoints to display alerts active at that time. Clicking in the date field both allows manual entry of dates and displays a clickable calendar for selecting a date. Click the calendar arrows to navigate within the current view (the default is the month view). To change to a view of months in a year or a range of years, click the current title (month, year, or year range) at the top-middle of the calendar. The view cycles to the next timeframe: month -> months -> years.
  3. Search: Entering text into the search box searches for alerts matching the following criteria: Alert ID, Alert Rule Name, Alert Type, Test ID, Test Name, Test Type, or Status. Entering text followed by the return/enter key executes the search and refines the table below. To filter alerts by more than one criterion, click either the All or Any link to specify whether the table rows must match all (AND) or any (OR) of the selected criteria.
  4. Alert Rule Name: Expand an alert rule for more detailed information, including agent or BGP monitor, Start/End Time, Metrics at Alert Start, Metrics at Alert End, and the duration for which the alert was active.
  5. Test Name: Name of the test for which the alert rule was triggered.
  6. Duration: Length of time for which the alert rule was active for that test.
  7. Alert ID: When gathering details for an alert via the ThousandEyes API, use the alert ID to reference a particular alert.

Assignment to Tests

Once you have created an alert rule, it can be assigned to any test that has the Enable box checked on the test configuration page. By default, each test has the rule "Default <test type> Rule" assigned to it, with your account's email address configured as the recipient for email notifications. To add or remove rules, click the pull-down menu below the Enable box and select or deselect rules. To create a new rule, click the Edit Alert Rules link to access the Add New Alert Rules page and create your rule. You will then be returned to the test configuration page, where you can use the pull-down menu to assign your new rule to the test.

Rule Configuration

Each rule has a name, a series of tests against which it is enabled, a scope of locations to which the alert rule applies, Boolean criteria defining the alert conditions, and the number of locations at which the alert conditions must be met in order to trigger an alert. The rule can also include a notification mechanism, such as a list of email recipients (recipients need not be users of ThousandEyes in order to receive email notifications), a PagerDuty service, or one or more webhooks.
The image below displays the configuration options of a new alert rule.
  1. Alert Type Layer: Test layers available to your organization.
  2. Alert Type: Available alert types for the selected test layer.
  3. Rule Name: An alphanumeric string naming this alert rule.

Settings Tab

  1. Tests: Select tests to which this alert rule is assigned. You may choose to configure no tests with this alert rule and assign it to tests at a later time.
  2. Monitors, Agents: This selector displays either "Monitors" for a Routing layer alert rule or "Agents" for all other alert rules. The selector has one of three values:
  • All: This alert rule applies to all agents or monitors for a test to which this alert rule is assigned.
  • All except: This alert rule applies to all agents or monitors for a test to which this alert rule is assigned, except for the agents or monitors specified in the selector that appears when the "All except" value is chosen.
  • Specific: This alert rule applies only to specific agents or monitors for a test to which this alert rule is assigned. The agents or monitors are specified in the selector that appears when the "Specific" value is chosen.
The image below displays the rest of the configuration options of a new alert rule:
  1. Specify the number of agents, whether all or any of the following alerting conditions must apply, and the number of test rounds for which the conditions must be met before alerting.
  2. Sticky Agents: Select "any of" if you want an alert sent when any set of agents meets the alert condition(s) in consecutive rounds. Select "the same" if you want an alert sent only if the same set of agents meets the alert condition(s) across multiple rounds. For example, suppose an alert rule is configured to trigger if the same agent trips a specified threshold in three consecutive rounds. The Atlanta Cloud Agent trips the rule in round one, the Ashburn Cloud Agent trips it in round two, and the San Francisco Cloud Agent trips it in round three. In this scenario, the alert rule would not trigger when using sticky agents; Atlanta, Ashburn, or San Francisco would need to trip the rule in three consecutive rounds to trigger the alert (see the sketch after this list). In addition, keep in mind that location alerts are triggered and cleared on a single-round basis, independently of the global alert. Therefore, a location alert appearing on a rule using sticky agents does not always imply that the location was part of the set of agents that met the "same agents X out of Y times" criteria, just that the agent met the alert condition(s) at least once while the global alert was active. Note: Sticky Agents are currently only available for Cloud and Enterprise Agent alerts.
  3. Threshold: Specify the threshold value for locations (agents, monitors, or countries, depending on rule type) that must meet the alert conditions in order to trigger this alert rule. This value is either a number of agents/monitors/countries or a percentage of agents/monitors/countries, as specified in the next setting.
NOTE: When a percentage of agents, monitors, or countries is used, and the percentage results in a non-whole number threshold value of actual agents, monitors, or countries, the fractional part of the value is significant. For example, when an alert rule with a threshold of 25% of all agents is applied to 13 agents, the threshold is 3.25 agents. This threshold will require 4 agents to meet the alert criteria in order to trigger the alert rule.
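In code terms, the effective threshold is the ceiling of the fractional value:
```python
# 25% of 13 agents is 3.25; the fractional part is significant, so the
# effective threshold rounds up to 4 agents.
import math

agents_in_test = 13
threshold = 0.25  # 25% of all agents

agents_required = math.ceil(agents_in_test * threshold)
print(agents_required)  # 4
```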
  1. Threshold units: Select either agents, monitors, or countries, or a percentage of agents, monitors, or countries.
  2. Rounds (met): Select the number of test rounds in which the following alert condition(s) must be met, out of a total number of rounds, in order to trigger the alert rule. See the Rounds (total) entry below.
  3. Rounds (total): Select the total number of test rounds over which the Rounds (met) selection is evaluated. For example, if Rounds (met) = 2 and Rounds (total) = 3, then for every three rounds, the alert rule triggers if the condition(s) were met twice.
  4. Metric: Select a test metric for this condition.
  5. Operators: The following operators are available:
  • >, <, ≥, ≤: Numerical comparisons (greater than, less than, greater than or equal to, less than or equal to). Available for all numerical (decimal and integer) metrics, such as the packet loss percentage (decimal) of Network layer tests or the Error Count (integer) of a Page Load test.
  • is, is not: Comparison to a value that is not a continuous range (e.g., an HTTP status code) or to a fixed string value, such as the Error Type (e.g., "DNS", "Connect", "SSL").
  • is in, is not in: Numeric or string comparison to a list of values. For example, a BGP Routing rule compares a test metric's AS number (integer) to a list of one or more AS numbers to determine whether the test metric is found in the list.
  • is empty, is not empty: Determines whether a metric has a value or has no value.
  • is incomplete: Determines whether a test completed the operations for a given metric. For example, a Path Trace alert rule can determine whether the path trace reached its destination, or a Page Load rule whether a page fully loaded.
  • is present: Triggered when an error condition is present.
  • matches, does not match: Determines whether the POSIX regular expression in the alert rule is found within the string produced by the test metric (i.e., a substring will produce a match). For example, an alert rule for the Error metric of an HTTP Server test with the following alert condition:
    will alert when the test's Error Details text is "SSL certificate problem: certificate has expired",
    because the regular expression "certificate\s*\w*:" matches the substring "certificate problem:".
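The sketch below reproduces this match in Python. Python's re engine is not strictly POSIX, but this pattern behaves the same way:
```python
# Reproduces the substring match described above. Python's re module is not
# strictly POSIX, but this pattern behaves identically here.
import re

error_details = "SSL certificate problem: certificate has expired"
pattern = r"certificate\s*\w*:"

print(bool(re.search(pattern, error_details)))  # True - matches "certificate problem:"
```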
The operators available per type of alert rule are also shown in the table below.
  1. Threshold: The value that the Metric setting is compared against, using the chosen operator. Note that some operators do not have a value field.
  2. Add/Delete: Click the + or - icon to add or delete alert criteria for this alert rule. Criteria can be nested for some types of alert rules.
  3. Compatible Test Types: Test types to which this alert rule can be assigned.
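To make the Sticky Agents and Rounds settings concrete, here is a simplified sketch of the evaluation logic, assuming a per-round record of which agents met the alert conditions; it is not ThousandEyes' actual implementation:
```python
# A simplified sketch (not ThousandEyes' implementation) of evaluating
# "N agents meet the conditions X of Y rounds", with and without Sticky
# Agents. `history` maps round number -> set of agents meeting the conditions.
from itertools import combinations

def rule_triggers(history, n_agents, rounds_met, rounds_total, sticky):
    window = [history[r] for r in sorted(history)][-rounds_total:]
    if sticky:
        # "the same": one fixed set of n_agents must violate in enough rounds.
        candidates = set().union(*window)
        return any(
            sum(set(combo) <= rnd for rnd in window) >= rounds_met
            for combo in combinations(candidates, n_agents)
        )
    # "any of": any n_agents may satisfy each qualifying round.
    return sum(len(rnd) >= n_agents for rnd in window) >= rounds_met

# The Sticky Agents example above: a different agent trips the threshold each round.
history = {1: {"Atlanta"}, 2: {"Ashburn"}, 3: {"San Francisco"}}
print(rule_triggers(history, 1, 3, 3, sticky=True))   # False - no single agent in all 3 rounds
print(rule_triggers(history, 1, 3, 3, sticky=False))  # True - some agent trips each round
```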

DNS Server Alert Rules

DNS server tests differ from other ThousandEyes tests in that multiple servers can be explicitly targeted in a single test. As a result, DNS server alert rules are evaluated on a per-server basis. That is, for each server in the DNS Servers field of the test configuration, the alert conditions are evaluated separately from all other servers in the DNS Servers field. For example, consider an alert rule that has the following alert conditions:
When assigned to a DNS server test with two servers configured as the targets, each server is evaluated separately against the above alert condition. To trigger the alert rule, at least four agents must receive an error against the same DNS server. The alert rule would not be triggered if, for example, three agents received an error when testing the first DNS server and a fourth agent received an error when testing the second DNS server.
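The per-server evaluation can be pictured as grouping agent errors by target server before applying the threshold, as in this simplified sketch (server names and results are illustrative):
```python
# A simplified sketch of per-server evaluation: agent errors are grouped by
# target DNS server, and the threshold (4 or more agents with an error) is
# checked per server. Server names and results are illustrative.
from collections import defaultdict

results = [
    ("ns1.example.com", "agent-1", True),  # (server, agent, error present?)
    ("ns1.example.com", "agent-2", True),
    ("ns1.example.com", "agent-3", True),
    ("ns2.example.com", "agent-4", True),  # fourth error, but on a different server
]

errors_by_server = defaultdict(set)
for server, agent, has_error in results:
    if has_error:
        errors_by_server[server].add(agent)

# No single server reaches 4 erroring agents, so the rule does not trigger.
print(any(len(agents) >= 4 for agents in errors_by_server.values()))  # False
```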

BGP Alert Rules

A BGP alert rule can be applied to a Routing Layer BGP test, or to a different Layer type that provides the BGP Route Visualization View. It is important to note that some alert rule conditions can be applied differently depending on which type of test the rule is assigned to. For example, a BGP test has only a single target prefix which will be evaluated against the alert Conditions. If the "Covered Prefixes" box is checked, any covered prefixes found are not evaluated against the alert Conditions except the explicit "Covered Prefix" condition.
In contrast, a non-BGP test type can have one or more targets. DNS Server tests can explicitly test multiple DNS servers, and an Agent to Server target's domain name can resolve to multiple server IP addresses. When creating the BGP Path Visualization, the Prefix selector shows these multiple target prefixes, and each prefix is evaluated against any BGP alert rules assigned to the test. Thus, prefixes which would be considered covered prefixes under a BGP test and not evaluated by the alert rule (unless by a "Covered Prefix" condition) are evaluated when the rule is assigned to the non-BGP test. Similarly, the "Covered Prefix" condition has no relevance when assigned to a non-BGP test.
BGP alert rules have a parameter named "Prefix Length", which is used to determine the length of prefixes evaluated by the rule. The "Prefix Length" can be individually configured for IPv4 and IPv6 protocols.
The default BGP alert rule will fire when 10% of monitors have less than 100% reachability.

Notifications Tab

In addition to presenting the alert in the app.thousandeyes.com UI, the ThousandEyes platform can deliver notifications of alerts through a number of services. The image below displays the Notifications configuration options of a new alert rule.
  1. Send emails to: A list of addresses to which an alert email is sent when the alert rule is first triggered. Addressees need not be users of the ThousandEyes platform.
  2. Edit emails: Click this link to add email addresses to the Notifications address book.
  3. Send an email: Check this box to send an email when the alert rule is no longer active.
  4. Add/Remove Message: Enter text to be added to the body of the alert rule's email notification. To prevent code injection, custom messages cannot contain words or phrases wrapped in angle brackets "<like this>".
  5. Webhooks: Webhooks-enabled web services that receive the alert notification.
  6. Edit webhooks: Create or edit webhooks, which can then be added to the Webhooks Send Notifications to field.
  7. Integrations: Integrations that should receive the alert notification.
  8. Edit Integrations: Create or edit an integration, which can then be added to the Integrations Send Notifications to field. Currently, ThousandEyes offers integrations for AppDynamics, PagerDuty, Slack, and ServiceNow.
Note: Alerts are active as long as your alert rule criteria are met, but any configured email notification will only occur at the beginning of the alert.

Available Operators, Metrics and Units

The following table lists the test layers and alert types available in the ThousandEyes platform, along with their metrics, operators, and units.
| Test Layer | Alert Type | Metric | Operators | Units |
| --- | --- | --- | --- | --- |
| Network | End-to-End (Server), End-to-End (Agent) | Packet loss | ≤, ≥ | % |
| Network | End-to-End (Server), End-to-End (Agent) | Latency¹ | ≤, ≥ | ms |
| Network | End-to-End (Server), End-to-End (Agent) | Jitter | ≤, ≥ | ms |
| Network | End-to-End (Server), End-to-End (Agent) | Error | is present, matches, does not match | n/a |
| Network | End-to-End (Agent) | Throughput | ≤, ≥ | Kbps |
| Network | End-to-End (Server) | Available Bandwidth | ≤, ≥ | Mbps |
| Network | End-to-End (Server) | Capacity | ≤, ≥ | Mbps |
| Network | Path Trace | Delay | ≤, ≥ | ms |
| Network | Path Trace | IP Address² | in, not in | IP address or prefix |
| Network | Path Trace | ASN² | in, not in | list of ASNs |
| Network | Path Trace | rDNS² | in, not in | exact hostname or wildcard-based match to domain |
| Network | Path Trace | MPLS Label² | is empty, is not empty | n/a |
| Network | Path Trace | DSCP² | is, is not | DSCP value selected from list |
| Network | Path Trace | Server IP | in, not in | IP address or prefix |
| Network | Path Trace | Server MSS | <, > | bytes |
| Network | Path Trace | Path MTU | <, > | bytes |
| Network | Path Trace | Path Length | <, > | hops |
| Network | Path Trace | Trace | is incomplete | n/a |
| DNS | Server, Trace, DNSSEC | Error | is present, matches, does not match | n/a |
| DNS | Server | Resolution time | ≤, ≥ | ms |
| DNS | Server, Trace | Mapping | is not in | quoted comma-separated list of mappings |
| DNS+ | Server Latency, Domain | Resolution Time | ≤, ≥ | ms |
| DNS+ | Domain | Availability | ≤, ≥ | % |
| DNS+ | Domain | Mapping | is not in | quoted comma-separated list of mappings |
| Web | HTTP Server | Response code | is | any error (≥ http/400 or no response), ok (http/200), redirect (http/300) |
| Web | HTTP Server | Response Header | matches, does not match | n/a |
| Web | HTTP Server | DNS time | ≤, ≥ | ms |
| Web | HTTP Server | Connect time | ≤, ≥ | ms |
| Web | HTTP Server | SSL negotiation time | ≤, ≥ | ms |
| Web | HTTP Server | Wait time | ≤, ≥ | ms |
| Web | HTTP Server | Receive time | ≤, ≥ | ms |
| Web | HTTP Server | Response time¹ | ≤, ≥ | ms |
| Web | HTTP Server | Total Fetch Time | ≤, ≥ | ms |
| Web | HTTP Server | Throughput | ≤, ≥ | kBps |
| Web | HTTP Server | Error | is present, matches, does not match | n/a |
| Web | HTTP Server | Error type | is, is not | DNS, Connect, SSL, Send, Receive, Content, HTTP, Any |
| Web | HTTP Server | Client SSL Alert Code | is, is not | SSL error type, e.g., Unexpected Message (10), Bad Certificate (42) |
| Web | HTTP Server | Server SSL Alert Code | is, is not | SSL error type, e.g., Unexpected Message (10), Bad Certificate (42) |
| Web | Page Load | Page load | is incomplete | n/a |
| Web | Page Load | Response time | ≤, ≥ | ms |
| Web | Page Load | DOM load time | ≤, ≥ | ms |
| Web | Page Load | Page load time¹ | ≤, ≥ | ms |
| Web | Page Load | Error Count | ≤, ≥ | # |
| Web | Page Load | Domain Name³ | is in, is not in | quoted comma-separated list of mappings |
| Web | Page Load | Total Fetch Time³ | ≤, ≥ | ms |
| Web | Page Load | Blocked Time³ | ≤, ≥ | ms |
| Web | Page Load | DNS Time³ | ≤, ≥ | ms |
| Web | Page Load | Connect Time³ | ≤, ≥ | ms |
| Web | Page Load | Send Time³ | ≤, ≥ | ms |
| Web | Page Load | Wait Time³ | ≤, ≥ | ms |
| Web | Page Load | Receive Time³ | ≤, ≥ | ms |
| Web | Page Load | SSL Negotiation Time³ | ≤, ≥ | ms |
| Web | Page Load | Component Load³ | is incomplete | n/a |
| Web | Transaction (Classic) | Error | is present | n/a |
| Web | Transaction (Classic) | Transaction Time | ≤, ≥ | ms |
| Web | Transaction (Classic) | Completion | ≤, ≥ | % |
| Web | Transaction (Classic) | Steps Completed | ≤, ≥, is | # |
| Web | Transaction (Classic) | Any Step meets | any, all | of the following conditions: Step Duration |
| Web | Transaction (Classic) | Step # meets | any, all | of the following conditions: Step Duration |
| Web | Transaction (Classic) | Any Page meets | any, all | of the following conditions: Page Duration |
| Web | Transaction (Classic) | Page # meets | any, all | of the following conditions: Page Duration |
| Web | Transaction | Page | selected by URL, Host, or Page # | n/a |
| Web | Transaction | Page/Any Page > Page Load Time | ≤, ≥ | ms |
| Web | Transaction | Page/Any Page > Page Load Error | is present, matches | n/a |
| Web | Transaction | Page/Any Page > Response Time | ≤, ≥ | ms |
| Web | Transaction | Page/Any Page > DOM Load Time | ≤, ≥ | ms |
| Web | Transaction | Marker (name) | exact textual matching, case-sensitive | n/a |
| Web | Transaction | Marker (presence) | is present, is not present | n/a |
| Web | Transaction | Marker (duration) | ≤, ≥ | ms |
| Web | Transaction | Assert Error | is present, matches, does not match | n/a |
| Web | Transaction | Transaction Time | ≤, ≥ | ms |
| Web | Transaction | Transaction Completion | is finished, has error, has internal error, timed out | n/a |
| Web | Transaction | Error | is present, matches, does not match | n/a |
| Routing | BGP | Reachability | <, > | % |
| Routing | BGP | Path Changes | <, > | n/a |
| Routing | BGP | Origin ASN | is in, is not in | comma-separated list of ASNs |
| Routing | BGP | Next Hop ASN | is in, is not in | comma-separated list of ASNs |
| Routing | BGP | Prefix | is in, is not in | comma-separated list of covered prefixes |
| Routing | BGP | Covered Prefix⁴ | exists, is in, is not in | comma-separated list of sub-prefixes |
| Voice | RTP Stream | Error | is present, matches, does not match | n/a |
| Voice | RTP Stream | MOS | ≤, ≥ | # |
| Voice | RTP Stream | Packet loss | ≤, ≥ | % |
| Voice | RTP Stream | Discards | ≤, ≥ | % |
| Voice | RTP Stream | DSCP | is, is not | DSCP values, e.g., Best Effort (0), Expedited Forwarding (46) |
| Voice | RTP Stream | Latency | ≤, ≥ | ms |
| Voice | RTP Stream | Packet Delay Variation | ≤, ≥ | ms |
  1. For some metrics, dynamic baselines can be configured. See the Dynamic Baselines section for more information.
  2. These metrics are configurable under the "Any Hop", "Last Hop", or "Hop #" entries in Path Trace alert rules. Select "Any" or "All" for multiple sub-conditions.
  3. These metrics are accessed under the "Any Component" alert condition in Page Load tests. Select "Any" or "All" for multiple sub-conditions.
  4. Only BGP Routing tests provide Covered Prefix data. Do not assign a BGP alert rule with a Covered Prefix metric to a non-BGP test type that has BGP Path Visualization measurements enabled. For non-BGP test types, use an alert rule that does not include the Covered Prefix metric, and if needed, create a separate BGP test and a separate alert rule with the Covered Prefix metric.
Each metric from the table above is defined in the article ThousandMetrics: What Do Your Results Mean?

Default Alerting Rules

Default alert rules are defined according to the following table. Within the account group, default alert rules can be changed by any user having a role with the View alert rules and Edit alert rules permissions, such as the built-in Account Admin or Organization Admin roles. Each alert type can be configured with zero or more rules that serve as its default alert rules.
| Name | Criteria | Minimum Locations |
| --- | --- | --- |
| Default Network Alert Rule | Packet loss ≥ 20% | 2 locations |
| Default DNS Trace Alert Rule | Error is present | 2 locations |
| Default DNS Server Alert Rule | Error is present | 2 locations |
| Default DNSSEC Alert Rule | Error is present | 2 locations |
| Default DNS+ Domain Alert Rule | Availability ≤ 90% and Reference Availability ≥ 90% | 2 countries |
| Default DNS+ Server Alert Rule | Resolution time ≥ 100ms | 1 country |
| Default HTTP Alert Rule | Error type is any | 2 locations |
| Default Page Load Alert Rule | Page load is incomplete | 2 locations |
| Default Transaction Alert Rule | Error is present | 2 locations |
| Default BGP Alert Rule | Reachability < 100% | 10% of locations |
| Default Voice Alert Rule | Error is present | 1 location |

Dynamic Baselines

Dynamic baselines allow users to create alerts that more accurately reflect the natural variance in test data. Using standard deviation, percentage change, or absolute values, users can configure alerts that dynamically determine whether to fire or not, based on historical data within a sliding time window.
Note: Dynamic baselines are currently only available for Cloud and Enterprise Agent alerts.
Let's imagine a scenario where an HTTP server test runs every fifteen minutes. Over the course of the first hour, four tests are run by an agent in New York, gathering response times of 510ms, 490ms, 550ms, and 450ms, for an average of 500ms. So far, the alert has not fired.
The alert uses a dynamic baseline, and has a two-hour window. Based on the four results so far, whether it will fire or not for the next test depends on whether it was configured using standard deviation, percentage change, or an absolute value:
  • The standard deviation (STDEV) for these results is 36. Using the default multiplier, the alert would fire if the next test returned a response time greater than 500+(36x2) = 572ms.
  • The percentage change would need to be at least 10% to have avoided firing until now. With an average of 500ms, the alert would now fire if the next test returned a response time greater than 500+10% = 550ms.
  • The absolute value needs to be at least 50ms for the alert to have not fired (the third value, 550, is 50 more than the average of the first two test results). The alert would therefore only fire if the next test returned a response time of 500+50 = 550ms.
    In this example, alert rules using both the percentage change and absolute values would fire at the same point (551ms or longer), while alert rules using standard deviation would not fire until 573ms.
Now let’s add two more results - 482ms and 464ms. All six results are within the two-hour window, which changes the average or baseline to 491ms, as well as changing when the alert fires:
  • The STDEV for the six results is 32.5, meaning that the alert would fire if the next test response time was greater than 491+(32.5*2) = 556ms.
  • The percentage change remains 10%, meaning that the alert would fire if the next test response time was greater than 491+10% = 540ms.
  • The absolute value remains 50ms, meaning that the alert would fire if the next test response time was greater than 491+50 = 541ms.
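The following snippet reproduces the arithmetic from this example; note that the quoted STDEV values correspond to the population standard deviation:
```python
# Reproduces the arithmetic above. statistics.pstdev (population standard
# deviation) matches the STDEV values quoted in this example.
from statistics import mean, pstdev

window = [510, 490, 550, 450, 482, 464]  # response times (ms) in the 2-hour window

avg = mean(window)                          # 491 ms
stdev_threshold = avg + 2 * pstdev(window)  # 491 + 2 * 32.5 = ~556 ms
pct_threshold = avg * 1.10                  # 491 + 10% = ~540 ms
abs_threshold = avg + 50                    # 491 + 50 = 541 ms

print(round(stdev_threshold), round(pct_threshold), round(abs_threshold))  # 556 540 541
```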
The different options allow users to adapt their alerting framework to better reflect the fluctuation in test results, and ensure that their system isn’t overwhelmed with alerts because of static metric baselines.
The following metrics currently support dynamic baselines:
  • Web / HTTP server / Response Time
  • Web / Page Load / Page Load Time
  • Network / End to End (Server) / Latency
The image below shows an example alert configuration using a dynamic baseline. The alert condition states that if the response time exceeds two standard deviations above the average value over the last four hours, the alert will fire.
Important Notes:
The time window for the alert must be at least three times the length of the interval of any tests it is attached to, in order to fire. For example, if a test runs every five minutes, the time window for the alert must be at least fifteen minutes in order to gather the three data points required.
Dynamic baseline alerts that are based on standard deviation can be very noisy for metrics with a small or very stable average. For example, the standard deviation of latency for a service could be less than 1ms. If your service jumped from 20ms to 20.4ms, this isn't inherently cause for concern, but with a sensitive dynamic baseline alert set up, this alert could fire regularly and increase noise. ThousandEyes advises adding an additional alert condition with an absolute difference from average when adding a standard deviation dynamic baseline alert. For example, you can add a condition that says "latency > 5ms above the mean of last 1 hour." This will ensure that your alert will only fire if it is above the standard deviation and above a certain absolute threshold vs. your average.

Additional Information

Cloud Agents displaying a Local Problems message on a test results page are excluded from alert calculations:
This is the equivalent of having the alert rule's Agents field set to "All agents except" the Cloud Agent with the Local Problems message.