Getting Started with Internet Insights
Internet Insights is a service that detects major, widespread network and application outages across the global Internet. In this guide, you will learn to:
- Identify and triage outages that are not caught by your own tests
- Add context around the outages that are caught by your tests
- Understand the historical reliability of one or more service providers
Why should you use Internet Insights? When a critical service is disrupted, it's common to wonder if you're the only one affected by the outage or if the issue is larger in scope or scale. Internet Insights gives you visibility into the networks and SaaS applications you depend on. Internet Insights is built upon ThousandEyes’ collective data set -- billions of probes across the Internet to websites, apps, and API endpoints every day -- combined with algorithmic outage detection to provide a macro-scale view into network and application outages. The intelligence derived from this data enables operations teams to quickly identify and resolve issues with providers using concrete Internet telemetry data.
To use Internet Insights, your organization must have purchased one or more package licenses. To check if your organization is licensed for Internet Insights, either:
- Navigate to the Account Settings > Usage and Billing page and see the Internet Insights Package Licenses in the Plan Usage section, or
- If your user account does not have permissions to view the Usage and Billing page, you can look for Internet Insights in your navigation menu.
  - If the Internet Insights menu item does not have any sub-items, then you currently do not have a license for Internet Insights.
  - If the Internet Insights menu item does contain sub-items, including Overview, Views, and Catalog Settings, then your organization is licensed for Internet Insights.
Internet Insights outages display as outage events. Outage events can have a variety of causes. Here are just a few examples:
- Failures of physical infrastructure, such as a major cable cut or loss of power at an internet exchange facility or a data center
- Failures of internet infrastructure due to configuration typos, unscheduled maintenance, or political interference
- Distributed denial-of-service (DDoS) attacks
Internet Insights detects outage events by analyzing the network-layer and application-layer results of every test that is run from ThousandEyes Cloud Agents or Enterprise Agents.
- A Network Outage is triggered when a concentration of packet loss events is detected within a single network point of presence (PoP) within a short period of time.
- An Application Outage is triggered when some or all servers that belong to the same application are failing, such as the application not responding to requests or responding with failure status codes.
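The network outage rule above can be illustrated with a toy sketch. This is not ThousandEyes' detection algorithm, whose thresholds and internals are not described in this guide. The `LossEvent` record, the `window_s` and `min_events` parameters, and their default values are all assumptions chosen for illustration. The sketch flags a PoP when enough packet loss events fall inside one short time window.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LossEvent:
    """Hypothetical record: one packet loss observation at a PoP."""
    pop: str          # point of presence identifier (made up for this sketch)
    timestamp: float  # seconds since an arbitrary epoch


def detect_network_outages(events, window_s=300.0, min_events=3):
    """Return the set of PoPs with at least `min_events` loss events
    inside any sliding window of `window_s` seconds.

    Both thresholds are illustrative assumptions, not product values.
    """
    by_pop = defaultdict(list)
    for ev in events:
        by_pop[ev.pop].append(ev.timestamp)

    flagged = set()
    for pop, times in by_pop.items():
        times.sort()
        start = 0
        for end, t in enumerate(times):
            # Advance the left edge until the window spans <= window_s.
            while t - times[start] > window_s:
                start += 1
            if end - start + 1 >= min_events:
                flagged.add(pop)
                break
    return flagged


events = [
    LossEvent("pop-a", 0), LossEvent("pop-a", 60), LossEvent("pop-a", 120),
    LossEvent("pop-b", 0), LossEvent("pop-b", 4000),
]
print(detect_network_outages(events))  # prints {'pop-a'}
```

In this sample, `pop-a` has three loss events within two minutes and is flagged, while the two events at `pop-b` are too far apart to count as a concentration.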
It's worth reading the entire article on Configuring Internet Insights – you'll learn about packages and providers, and how to choose among them based on your business requirements or on critical business services.
The Internet Insights catalog is a collection of service providers grouped into packages, and categorized by geographic region and provider type. Each Internet Insights package license allows you to activate one package from the catalog. For example, if you host infrastructure or workloads in public cloud providers’ North American data centers, or you depend on services that are hosted in those environments, you should activate the North American IAAS Providers package for visibility and outage detection into AWS, Azure, GCP, and others. You can see the complete catalog and the currently activated packages from the Internet Insights > Catalog Settings page. To view the specific providers included within a package, click the row for that package in the Active Packages list. This opens the Coverage Map dialog which shows the providers and the locations where Internet Insights has visibility coverage.
Provider labels can be used to filter outages in the Internet Insights views, alert rules, and dashboard widgets. To create a provider label, navigate to the Internet Insights > Catalog Settings > Labels tab and click Add New Label. To learn more, read the Provider Labels article.
To start using Internet Insights, you’ll need to activate one or more packages. Without any active packages, you’ll still see limited data – but only for your own Cloud and Enterprise Agent tests. To effectively use the visualizations, dashboards, and alerts, ensure that you allocate all of your available licenses by activating packages in the catalog.
To activate a package for Internet Insights when you have available licenses for it:
1. Go to the Internet Insights > Catalog Settings screen and click the Packages tab.
2. Verify that the Available counter shows one or more licenses.
3. Find the row with the package that you want to add.
4. In the Included column, click the Active slider to add the package.
To activate a package when you have no available licenses, you must first deactivate a package, then activate the desired package in its place. To deactivate an Internet Insights package:
1. Go to the Internet Insights > Catalog Settings screen and click the Packages tab.
2. Find the row with the package that you want to remove.
3. In the Included column, click the Active slider to remove the package.
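The license accounting behind these steps (one license per active package, and a swap when no licenses are available) can be modeled with a small sketch. The `Catalog` class and its method names are hypothetical and exist only for illustration; they do not correspond to any ThousandEyes API, and the second package name is a made-up placeholder.

```python
class Catalog:
    """Hypothetical model: each license allows one active package."""

    def __init__(self, licenses):
        self.licenses = licenses
        self.active = set()

    @property
    def available(self):
        # Mirrors the idea of the Available counter on the Packages tab.
        return self.licenses - len(self.active)

    def activate(self, package):
        if package in self.active:
            return
        if self.available < 1:
            raise RuntimeError("no available licenses; deactivate a package first")
        self.active.add(package)

    def deactivate(self, package):
        self.active.discard(package)

    def swap(self, old, new):
        # Deactivate first to free a license, then activate the replacement.
        self.deactivate(old)
        self.activate(new)


catalog = Catalog(licenses=1)
catalog.activate("North American IAAS Providers")
print(catalog.available)  # prints 0

# "Example Package B" is a placeholder name, not a real catalog entry.
catalog.swap("North American IAAS Providers", "Example Package B")
print(sorted(catalog.active))  # prints ['Example Package B']
```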
The following sections describe the functionality of Internet Insights screens at a high level. For complete details about these screens, see the Internet Insights Screens article and its sub-articles.
For network outages, the Internet Insights > Views > Topology tab shows a network path visualization with traffic sources, target destinations, and the Internet hops between them. Sources are shown on the left, and destinations are shown on the right. The center of the visualization shows the interfaces where the outage is occurring. By default, interfaces are grouped by Autonomous System Number (ASN); click on the interface group in the topology visualization to drill down by location or IP address.
The short video clip below shows a network outage Topology tab, beginning with the default grouping and filters and then drilling down to the specific interfaces.
- The Topology tab shows the flow from agents to the application. The application can be visualized by ASN, network prefix, location, or domain. Agents can be visualized by ASN or location. Use this tab to visualize the scope of the outage.
- The Map tab shows the geographical scope of the selected outage, along with summarized outage information.
- The Table tab includes detailed information on the types of application errors, as shown below.
The following sections describe some of the ways you can use Internet Insights, including:
- Outage detection and triage
- Adding detail to Internet Insights outages with Cloud and Enterprise Agents: “macro to micro”
- Adding context to Cloud and Enterprise Agent tests with Internet Insights: “micro to macro”
Triaging is useful when you want to understand the impact and scope that an Internet outage is having on your organization. One common workflow is using Internet Insights for real-time outage detection and triaging. For example, some customers display the Internet Insights Overview screen on a large monitor in a 24/7/365 network operations center (NOC).
Navigate to Internet Insights > Overview. Start with the filters below and adjust them to fit your specific needs:
- Outage Type: All
- Outage Scope: All
- Affected Provider: All
- Last 30 minutes
The screenshot below of the Overview shows an outage that has occurred in the last few minutes. To begin triaging the outage, click on the provider name in the Outage Events column, or hover over the outage indicator on the map and click the provider name. This will open the Network Outages or Application Outages view based on the type of outage.
In the view, scroll down to the Topology to quickly ascertain the scope and impact of the outage. First, consider the affected source locations (on the left of the topology) and affected destinations (on the right of the topology). If you do not have any stakeholders located in these areas and you do not depend on the affected services, then the outage likely has little to no impact on your operations.
If you do have users in the affected locations, and you depend on one or more of the affected services, then you should investigate the outage more closely. Hover over the affected interfaces (center of the topology) to show the outage details popover which indicates how many, if any, of your Cloud and Enterprise Agent tests are affected by this outage.
Be aware that the absence of affected tests does not necessarily mean the outage is not impacting you or your users. Unless you are certain you have created and enabled Cloud and Enterprise Agent tests that target the affected destination, you may have a gap in visibility; that is, you may have no tests that would have been affected by the outage.
If the affected destination is a critical service, or when the geographic scope is large, then the outage warrants deeper investigation. You may want to create new tests, or increase the frequency of existing tests, to ensure you have visibility coverage.
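The triage reasoning in the preceding paragraphs can be summarized as a short decision sketch. The `triage` function, its parameters, and the returned strings are hypothetical and only restate the guidance above. The guide does not say what to do when only one of the two conditions holds (users affected, or services affected, but not both), so treating that case as worth investigating is an assumption of this sketch.

```python
def triage(affected_locations, affected_services,
           my_locations, my_services, affected_test_count):
    """Return a list of triage notes for one outage.

    All inputs are plain collections of names supplied by the caller;
    nothing here is read from Internet Insights itself.
    """
    users_hit = bool(set(affected_locations) & set(my_locations))
    services_hit = bool(set(affected_services) & set(my_services))

    if not users_hit and not services_hit:
        return ["likely little to no impact"]

    # Assumption: a partial match is treated the same as a full match.
    notes = ["investigate more closely"]
    if affected_test_count == 0:
        # No affected tests may mean no coverage rather than no impact.
        notes.append("possible visibility gap: consider adding tests")
    return notes


print(triage(["Chicago"], ["example-saas"],
             ["Chicago"], ["example-saas"], affected_test_count=0))
```

The location and service names in the usage line are placeholders. With them, the function returns both the investigation note and the visibility-gap note, because the outage overlaps your footprint but none of your tests registered it.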
Internet Insights, the “macro-level” view, provides context for outages but less detail than the “micro-level” view of individual Cloud and Enterprise Agent tests. Starting from the Internet Insights view, when one or more of your tests are affected by a given outage, the destination nodes in the topology display a small yellow circle. Hover over the destination node to see the specific tests.
In the details popover, click the test name to open the test view in a new tab or window.
You can also see affected tests, if any, in the Map and Table tabs.
When opening an affected test from a Network Outage view, the test view will default to the Network > Path Visualization layer. Look for these two key interface elements when viewing a test affected by an outage:
- The purple outage swimlane below the timeline, which indicates the specific test rounds in which the test was affected by an outage
- The Outage Detected button in the Path Visualization panel
Click the [#] nodes link inside the Outage Detected button to quickly highlight the nodes that are affected by the outage. Then look for the node(s) in the path visualization with a dashed red outline, like the one shown in the screenshot below.
Hover over the affected node and click the Show only agents using the node link to filter the path visualization and show only the agents which are affected. You can then hover over an agent and click Show on timeline to filter the timeline to the specific agent: this is useful for investigating any reductions in availability that may occur before and after a total service outage.
Internet Insights can also be used to provide broader context to the precision-targeted Cloud and Enterprise Agent test results. When you are viewing a Cloud and Enterprise Agent test to investigate an issue, like a drop in application availability or an increase in network packet loss, you can quickly determine if the issue is affecting “just you” by looking for the purple swimlane below the timeline. Each test round in which the purple swimlane is shown indicates that an outage was detected that affects both this test and tests from other ThousandEyes organizations.
The screenshot below shows an HTTP Server view, with the Availability metric selected on the timeline. In the middle of the timeline you can see a drop in server availability, and below the timeline, six purple bars are shown, indicating there was a broader outage affecting this test for six rounds.
To navigate from a Cloud and Enterprise Agent test view to the corresponding Internet Insights view, select a test round on the timeline which has a purple bar in the swimlane, then click Path Visualization in the Views menu to the left of the timeline.
In the Path Visualization panel, look for the Outage Detected button. Clicking on the button anywhere except the [#] nodes link will display the outage details pop-up dialog.
In the outage details pop-up dialog, click Internet Insights Views to open a new browser tab or window. The Internet Insights view will automatically be set to the appropriate outage type and filtered by the test from which the view was opened, as shown in the screenshot below.
You can set up custom dashboards and alert rules for Internet Insights based on affected catalog providers or applications, including your own affected tests. The following two sections briefly describe how to configure alert rules and dashboard widgets using Internet Insights data. For more information, see the Using Alerts and Dashboards with Internet Insights article.
To configure Internet Insights alert rules, navigate to the Alerts > Alert Rules screen, click the Internet Insights tab, and click Add New Alert Rule. Configure your settings and alert conditions in the dialog that opens, and click Create New Alert Rule. See the Setting Up Alert Rules for Internet Insights article for more information on Internet Insights alert rule settings and conditions.
Internet Insights includes a built-in dashboard. To view it, navigate to Dashboards and select Internet Insights Built-in from the Dashboard: dropdown selector. For more information on the built-in dashboard, see the Using the Internet Insights Built-In Dashboard article.
You can also select Internet Insights as the data source for Data Summary widgets, Time Series widgets, and Map widgets on your custom dashboards. First, select your custom dashboard from the Dashboard: dropdown selector, or create a new custom dashboard by clicking the Options button and clicking Create New Dashboard.
With your custom dashboard open, click + Add Widget at the top of the page and choose a widget in any of the Time Series, Data Summary, or Map categories. Select Internet Insights for the Data Source, and proceed to configure the widget based on your specific needs.