Monitoring Slack

Slack is a platform for teams to communicate in real time. If your organization uses Slack, you most likely rely on it for collaboration and productivity. If it’s a critical tool for your organization, you’ll want to monitor its availability and performance for your workforce.

This guide will walk you through how to monitor your workforce’s Slack digital experience. To accomplish this, you'll use multiple features within ThousandEyes, including:

  • ThousandEyes Cloud and Enterprise Agents

  • ThousandEyes Endpoint Agents

  • ThousandEyes Internet Insights

This best-practices guide is intended for readers who already have some proficiency in ThousandEyes and are ready to delve into more advanced guidance. It assumes a basic understanding of networking concepts and how Slack is used for collaboration, as well as familiarity with the ThousandEyes platform.

If you are new to ThousandEyes, we recommend starting with the getting started guides to establish a solid foundation. If you are unsure, see the Audience Prerequisites section below for more details about the assumed knowledge in this article.

Audience Prerequisites

To effectively follow this guide, you should be:

  • Familiar with ThousandEyes’ role-based access control settings to ensure your user account has the necessary permissions.

  • Able to deploy Enterprise Agents and/or Endpoint Agents if they’re not already deployed.

  • Aware of your organization’s available licenses for Endpoint Agents and Internet Insights, and available units for Cloud and Enterprise Agents. For information on your organization’s usage and capacity, see the articles in the Usage-Based Billing section.

  • Comfortable with networking concepts, such as:

    • TCP/IP, HTTP, DNS, and how they relate to the ThousandEyes platform.

    • Content Delivery Networks (CDNs).

    • The difference between DNS resolvers (recursive DNS servers) and authoritative DNS nameservers. Learn more in the ThousandEyes Learning Center.

    • Basic knowledge of APIs and their uses.

  • Proficient in Slack concepts and connectivity principles.

Slack Architecture

The Slack architecture includes a client-side app and server-side infrastructure. The Slack client is software installed on users’ computers or devices that connects to the Slack servers through the edge network.

The edge network consists of a set of globally distributed edge regions, or AWS datacenters, called edge PoPs. These edge PoPs sit closer to users to reduce latency and improve performance, and they connect users back to Slack’s main region in AWS us-east-1. This is the region where storage and core services live, and Slack relies heavily on availability zone (AZ) resilience to withstand issues in any given AZ.

When a user launches the Slack client, it makes many DNS requests in the background to resolve slack.com, wss-primary.slack.com, slack-edge.com, and any other domains that Slack uses. If a DNS request cannot be answered from the local cache, it is resolved through the configured DNS recursive resolver and ultimately answered by the authoritative name server. All of this happens before the Slack client and its associated services finish loading.
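The lookups described above can be approximated with a short script. This is a minimal sketch, assuming the operating system's resolver stands in for the Slack client's own behavior; the function name `resolve_hostnames` and the injectable `lookup` parameter are illustrative and not part of Slack or ThousandEyes.

```python
import socket
from typing import Callable, Dict, Iterable, List

# Hostnames taken from the paragraph above.
SLACK_HOSTNAMES = ["slack.com", "wss-primary.slack.com", "slack-edge.com"]


def resolve_hostnames(
    hostnames: Iterable[str],
    lookup: Callable = socket.getaddrinfo,
) -> Dict[str, List[str]]:
    """Return the sorted, de-duplicated addresses found for each hostname.

    `lookup` defaults to the operating system's resolver. A failed lookup
    is recorded as an empty list instead of raising.
    """
    results: Dict[str, List[str]] = {}
    for name in hostnames:
        try:
            infos = lookup(name, 443, proto=socket.IPPROTO_TCP)
        except socket.gaierror:
            results[name] = []
            continue
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address for both IPv4 and IPv6.
        results[name] = sorted({info[4][0] for info in infos})
    return results
```

Calling `resolve_hostnames(SLACK_HOSTNAMES)` from a workstation gives a rough view of what that machine's resolver returns. It does not replace a DNS server test, which targets specific resolvers and records timing.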

Slack uses Amazon Route 53 (R53) as the authoritative name server for most domains. Additionally, all WebSocket records are subdelegated to NS1 for WebSocket traffic. Both NS1 and Route 53 route requests to the region closest to the user. The DNS lookup response is a list of IP addresses for the region closest to the user. The client then picks a random IP address from the list for Slack application connections. These IP addresses are public-facing network load balancers (NLBs) that front both the WebSocket and non-WebSocket stacks and are configured in passthrough mode (Layer 4 load balancer); they forward the network packets to either the WebSocket or the non-WebSocket load-balancing stack (Layer 7) at the edge. The client then initiates an HTTPS connection with Slack servers, which upgrades the connection stream to a WebSocket connection.
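The last two steps in this paragraph, picking a random address and upgrading to WebSocket, can be sketched as follows. This is an illustration of the generic WebSocket handshake defined in RFC 6455, not Slack's client code; the helper names and the request path are assumptions, and a real client sends this request over TLS.

```python
import base64
import hashlib
import random
from typing import Optional, Sequence

# GUID fixed by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def pick_address(addresses: Sequence[str], rng: Optional[random.Random] = None) -> str:
    """Pick one address at random from the list returned by DNS."""
    if not addresses:
        raise ValueError("DNS answer contained no addresses")
    return (rng or random).choice(addresses)


def expected_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value a server returns for a given key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_request(host: str, path: str, sec_websocket_key: str) -> str:
    """Build the HTTP/1.1 request that asks the server to switch to WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {sec_websocket_key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines)
```

A server that accepts the upgrade answers with status 101 and a `Sec-WebSocket-Accept` header equal to `expected_accept(key)`. Because the client picks addresses at random, two clients in the same region can land on different load balancers, which is worth remembering when comparing results across agents.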

At this point, the user is ready to send a message on Slack. Requests to Slack can be broadly classified into two categories: WebSocket and non-WebSocket.

WebSocket Traffic

Slack uses WebSocket connections for sending and receiving messages. These WebSocket connections are ingested into a system called envoy-wss (WebSocket Service) and are accessible from the internet using the wss-primary.slack.com and wss-backup.slack.com DNS domains. NS1 will point the request to the NLB(s) in the region closest to the customer (based on the client subnet information). The NLB will forward the requests to the servers handling Slack WebSocket traffic, which complete the SSL/TLS handshake and then maintain WebSocket connections at Slack to enable real-time messaging.

Edge API (Non-WebSocket) Traffic

All non-WebSocket traffic — for example, Slack API, slackbot, webhooks, and third-party apps — flows through the edge API stack, or envoy-edge. These are the set of Envoy servers dedicated to API traffic.

Traffic is ingested into this stack via various DNS domains; for example, app.slack.com (API traffic destined for the webapp), edgeapi.slack.com (Slack’s user and channel information), files.slack.com (file upload and download), and slack-imgs.com (Slack image unfurling). Depending on the domain a request targets, Route 53 or NS1 points the request to the NLBs in the region closest to the user in terms of latency.
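The split between the two ingress stacks can be expressed as a simple lookup, which helps when deciding which test results point at which stack. This is a sketch built only from the domains named in this section; the function name `ingress_stack` is illustrative, and domains not listed here return `None` and are not guessed.

```python
from typing import Optional

# Domains and stacks as described in the architecture section above.
WEBSOCKET_DOMAINS = {"wss-primary.slack.com", "wss-backup.slack.com"}
EDGE_API_DOMAINS = {
    "app.slack.com": "API traffic destined for the webapp",
    "edgeapi.slack.com": "user and channel information",
    "files.slack.com": "file upload and download",
    "slack-imgs.com": "image unfurling",
}


def ingress_stack(hostname: str) -> Optional[str]:
    """Return 'envoy-wss', 'envoy-edge', or None for an unlisted domain."""
    host = hostname.strip().lower().rstrip(".")
    if host in WEBSOCKET_DOMAINS:
        return "envoy-wss"
    if host in EDGE_API_DOMAINS:
        return "envoy-edge"
    return None
```

For example, `ingress_stack("files.slack.com")` returns `"envoy-edge"`, so a failure there alongside healthy wss-primary.slack.com results suggests looking at the edge API path first.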

CDN

A content delivery network (CDN) is a group of distributed servers that caches content near end users, providing improved performance and lower latency. Slack uses AWS CloudFront as its primary CDN. All the static assets that are critical for booting the Slack client are served through the a.slack-edge.com domain via CloudFront. Additionally, there is a backup CDN domain accessible via b.slack-edge.com.

Slack Huddle

Slack Huddles are lightweight audio calls that let you and members of your team talk and share your desktop screens with each other in real time. When a user joins a huddle, https://yourworkspace.slack.com/api/rooms.join is called. The Huddle connection request requires a session token, which is passed as a cookie. This token requirement limits the ability to monitor Huddle performance.

Monitoring Slack with Cloud and Enterprise Agents

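You can monitor Slack services from within your own secure networks using ThousandEyes Enterprise Agents, and from outside of your own networks using ThousandEyes Cloud Agents. This section covers agent placement and selection, the Slack domains to monitor, the Slack monitoring templates and how to deploy them, and the Slack Health Overview dashboard.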

Agent Placement and Selection

Enterprise Agents

Enterprise Agents are used to monitor DNS resolution (either internal resolvers or public resolvers), and to monitor Slack application performance. Use ThousandEyes Enterprise Agents to proactively monitor your Slack user experience from vantage points within your enterprise WAN. This is known as “inside-out” testing.

In the default template configuration, you will select Enterprise Agents that will run the tests to monitor Slack.

If you have not yet installed any Enterprise Agents, see the Installing section of the Enterprise Agent documentation.

  • At a minimum, place an Enterprise Agent at each internet egress point.

  • For hub-and-spoke network architectures with centralized egress, the recommended best practice is to also test from Enterprise Agents at each “spoke” user location. Spokes are typically branch offices.

Cloud Agents

Use ThousandEyes Cloud Agents to establish a baseline for web application performance outside of the enterprise network – the “outside-in” view.

It is recommended that you select at least one Cloud Agent for each region where you have internet access and an Enterprise Agent. Additionally, Cloud Agents can be added based on where you have a concentration of users connecting to Slack. These Cloud Agents provide an external baseline.

Make sure your environment meets the network requirements for connection to Slack.

Slack Domains to Monitor

  • [tenant name].enterprise.slack.com

  • wss-primary.slack.com or wss-backup.slack.com (for WebSocket connections)

  • a.slack-edge.com or b.slack-edge.com (Slack’s CDN CloudFront)

  • app.slack.com (API traffic that is destined for the Slack web application)

  • edgeapi.slack.com (Slack’s user and channel information)

  • files.slack.com (for file upload and download)

  • slack-imgs.com (Slack image unfurling)

  • slack.com/api/api.test (for testing calls to the Slack API)
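The list above can be turned into concrete test targets once the tenant name is known. This is a minimal sketch: the function name and all dictionary labels are illustrative, only the primary WebSocket and CDN domains are included, and the tenant-name check is a conservative assumption rather than Slack's actual naming rule.

```python
from typing import Dict


def slack_test_targets(tenant_name: str) -> Dict[str, str]:
    """Map an illustrative label to the URL of each Slack domain to monitor."""
    tenant = tenant_name.strip().lower()
    # Assumption: tenant names contain only letters, digits, and hyphens.
    if not tenant or not all(ch.isalnum() or ch == "-" for ch in tenant):
        raise ValueError(f"unexpected tenant name: {tenant_name!r}")
    return {
        "Workspace": f"https://{tenant}.enterprise.slack.com",
        "WebSocket": "https://wss-primary.slack.com",
        "CDN": "https://a.slack-edge.com",
        "Web App API": "https://app.slack.com",
        "User Info": "https://edgeapi.slack.com",
        "Files": "https://files.slack.com",
        "Images": "https://slack-imgs.com",
        "API Test": "https://slack.com/api/api.test",
    }
```

With the tenant name "thousandeyes", the "Workspace" entry becomes `https://thousandeyes.enterprise.slack.com`.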

Slack Monitoring Template

ThousandEyes provides a template specifically designed for monitoring Slack. It includes Cloud and Enterprise Agent tests for comprehensive monitoring of critical Slack services. This template includes five HTTP server tests and one DNS server test. The sections below outline each of these tests, as well as optional additional tests that can be configured.

Monitoring Your Slack Instance

These two tests require your Slack workspace URL. Every Slack Enterprise instance uses a Slack workspace URL, a unique public web address that allows access from the internet and enables communication and collaboration among team members. Monitoring your workspace URL from your members’ workplaces is critical to ensuring a good user experience. To find your Slack workspace URL, navigate to your profile settings in Slack and locate the ‘Workspace URL’ section.

HTTP server tests run from both ThousandEyes Enterprise Agents and Cloud Agents. They monitor the performance, reachability, and availability of your Slack Enterprise instance. When an HTTP server test detects an issue, the results pinpoint the phase of the request in which the issue occurred, helping to decrease your mean time to repair. To assist the analysis, this test type also includes information from lower layers: agent-to-server, network, and BGP routing data are readily available when looking for the root cause.

DNS server tests run from ThousandEyes Enterprise Agents (“internal”). These tests monitor network connectivity to your DNS recursive resolvers (public or internal) and DNS resolution performance. Monitoring DNS in addition to HTTP for application availability and performance is crucial because DNS is critical for translating domain names to IP addresses. Any issues or delays in DNS resolution can significantly impact the accessibility and performance of a web application, even if the HTTP service itself is functioning properly.

Monitoring the WebSocket Connection to Slack

As the Slack Architecture section states, Slack uses WebSocket connections for sending and receiving messages, which are ingested into the WebSocket service using the domain names wss-primary.slack.com and wss-backup.slack.com. An HTTP server test that monitors these targets from ThousandEyes Enterprise Agents (“internal”) helps ensure that Slack messaging is delivered smoothly.

A successful server response from this target includes the text “worked”, so the test uses the HTTP option “Verify Content” to check for it as extra validation. If this text isn’t received, the HTTP server test fails with an error.
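The pass/fail logic of a status check combined with content verification can be sketched as below. This is an illustration only: it uses a plain substring match, which may differ from how the product evaluates the Verify Content field, and the function name `evaluate_response` is hypothetical.

```python
from typing import Optional, Tuple


def evaluate_response(
    status: int,
    body: str,
    desired_status: int = 200,
    verify_content: Optional[str] = None,
) -> Tuple[bool, str]:
    """Return (passed, reason) for a single HTTP response."""
    if status != desired_status:
        return False, f"expected status {desired_status}, got {status}"
    if verify_content is not None and verify_content not in body:
        return False, f"response body does not contain {verify_content!r}"
    return True, "ok"
```

For the WebSocket target, `evaluate_response(200, body, verify_content="worked")` passes only when both the status and the body text match. The same helper covers targets where a non-200 status is the healthy result by passing a different `desired_status`.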

Testing the backup WebSocket is optional and is not included in the template.

Monitoring Non-WebSocket Connections to Slack

As the Slack Architecture section states, other traffic to the Slack web app is served by the Edge API (non-WebSocket) stack. This API traffic is ingested into the Edge API stack via various DNS domains, depending on the services the client accesses. In the template, we recommend the two primary domains, app.slack.com and edgeapi.slack.com, as the HTTP server test targets. Other common Slack domains, such as files.slack.com and slack-imgs.com, are listed as optional test targets.

| Test Layer | Web |
| Test Type | HTTP server |
| Default Test Name | template name - tenant name - User Info |
| Test Description | Slack’s user and channel information |
| Test Target (URL) | https://edgeapi.slack.com |
| Interval | 1, 2, or 5 minutes |
| Agents | Enterprise Agents. Cloud Agents for baseline |
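The settings in this table can be captured as a small record, which is convenient for reviewing planned tests before deployment. This is a sketch under stated assumptions: the class and field names are hypothetical and do not mirror the ThousandEyes API schema, and the default interval of 2 minutes is chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Intervals listed in the table above, in minutes.
ALLOWED_INTERVAL_MINUTES = (1, 2, 5)


@dataclass
class HttpServerTest:
    template_name: str
    tenant_name: str
    suffix: str
    description: str
    url: str
    interval_minutes: int = 2
    agents: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject intervals the table does not list.
        if self.interval_minutes not in ALLOWED_INTERVAL_MINUTES:
            raise ValueError(f"interval must be one of {ALLOWED_INTERVAL_MINUTES}")

    @property
    def test_name(self) -> str:
        """Follow the 'template name - tenant name - suffix' naming pattern."""
        return f"{self.template_name} - {self.tenant_name} - {self.suffix}"

    @property
    def interval_seconds(self) -> int:
        return self.interval_minutes * 60
```

For example, `HttpServerTest("Slack", "thousandeyes", "User Info", "Slack's user and channel information", "https://edgeapi.slack.com")` has the test name "Slack - thousandeyes - User Info".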

The following tests are optional and not included in this template. Depending on your Slack use case, we suggest monitoring them separately.

Monitoring the CDN/CloudFront

CDN performance tests conducted using ThousandEyes Cloud Agents provide deep insights into CDN content delivery, including geographic CDN load balancing, latency, and availability. ThousandEyes monitors the performance of your application, origin, and CDN edge node, so you can quickly pinpoint and diagnose issues that impact a user’s digital experience. You can gain insight into how CDNs will affect application delivery in real-time and perform CDN comparisons across regions and over time with page load, transaction, and HTTP tests that include rich network measurements, Layer 3 path visualization, and BGP route visualizations.

Cached static assets in the CDN are served from specific paths under the domain. Because the root of the domain has no content, an HTTP server test to the main domain is expected to receive a 404 response. For that reason, this test sets the Desired Status Code to 404.

Slack Application-Layer Monitoring Template

If you deploy this template, you do not need to deploy the Slack Monitoring Template.

This template is suitable for monitoring the Slack Application layer, including HTTP server tests from the Slack Monitoring Template, as well as one Page Load test and one API test.

These browser synthetic tests require Enterprise Agents using Chromium. Chromium is included as part of the BrowserBot package. For more information, see What Is BrowserBot?.

Monitoring Your Slack Instance

This test requires your Slack workspace URL. Every Slack Enterprise instance uses a Slack workspace URL, a unique public web address that allows access from the internet and enables communication and collaboration among team members. Monitoring your workspace URL from your members’ workplaces is critical to ensuring a good user experience. To find your Slack workspace URL, navigate to your profile settings in Slack and locate the ‘Workspace URL’ section.

Page load tests run from both ThousandEyes Enterprise Agents and Cloud Agents. They measure the performance of an individual web page. This test uses a Chromium-based browser running on a Cisco ThousandEyes Cloud or Enterprise Agent. In addition to showing total page load time, the test shows the load times for each of the Document Object Model (DOM) components on the page.

Monitoring the Slack API

For users leveraging Slack to create modular functions and build an automated workflow (see https://api.slack.com), an API test is essential to validate the availability of the Slack API.

API tests can reveal the application state and common application failure patterns of the critical web API endpoints within your application ecosystem, as seen from the vantage point of a ThousandEyes Cloud or Enterprise Agent. You can use the “Step Builder” to configure a single test that makes multiple API calls to different endpoints and passes results as variables from one call to the next. API tests support any HTTP endpoint: they send a request, receive the response, and capture the timing, and they can also send pre-defined variables and evaluate assertions. In this test, we use the assertion rule “Status code: 200” so that any other HTTP response status triggers an error.
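The idea of chaining calls, passing a value from one response into the next request, and asserting on the status code can be sketched outside the product as follows. This is not how the Step Builder is implemented; the step dictionary keys (`name`, `url`, `assert_status`, `extract`) and the `send` transport are hypothetical names chosen for the illustration.

```python
import json
import string
from typing import Callable, Dict, List, Tuple

# A transport takes a URL and returns (status_code, body_text).
Transport = Callable[[str], Tuple[int, str]]


def run_steps(steps: List[dict], send: Transport) -> Dict[str, str]:
    """Run steps in order, asserting status codes and carrying variables forward."""
    variables: Dict[str, str] = {}
    for step in steps:
        # Fill $name placeholders with values captured by earlier steps.
        url = string.Template(step["url"]).substitute(variables)
        status, body = send(url)
        expected = step.get("assert_status", 200)
        if status != expected:
            raise AssertionError(f"{step['name']}: expected {expected}, got {status}")
        # Copy selected top-level JSON fields into variables for later steps.
        for var_name, json_key in step.get("extract", {}).items():
            variables[var_name] = str(json.loads(body)[json_key])
    return variables
```

Because the transport is passed in, the same runner can be exercised with a stub in unit tests or wired to a real HTTP client. A step whose status differs from its assertion stops the run, which mirrors how a failed assertion marks the API test as an error.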

The following test is optional, and not included in this template. If you would like to monitor a specific Slack API endpoint, obtain a Bearer authentication token by creating a new app via https://api.slack.com/apps, then configure the Bearer token in the Authentication tab.
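Outside the product, Bearer authentication amounts to one request header. The sketch below builds such a request with the Python standard library; the helper name `bearer_request` is illustrative, and the token should be the one issued for your own Slack app.

```python
import urllib.request


def bearer_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries the token in the Authorization header."""
    if not token:
        raise ValueError("a Bearer token is required")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

Passing the result to `urllib.request.urlopen(request, timeout=10)` sends the call. Keep the token out of source control and test descriptions, since anyone holding it can call the API with your app's permissions.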

Using the Slack Template

Now that we've discussed all the tests, agents, and targets in the ThousandEyes Slack template, we are ready to use this template to deploy our tests.

Multiple permissions are required in order to view and deploy templates. For the complete list, see the Template Prerequisites.

To begin deploying a template:

  1. Click the dropdown next to the Add New Test button.

  2. In the dropdown menu, select Add From Template.

    The Deploy Template panel opens.

  3. In the Search field, enter Slack.

    Alternatively, you can use the Collections filter and then select Slack.

    There are two Slack-related certified templates:

    • Slack

    • Slack Application Layer

    To choose which template you should use, consider your agent selection:

    • If you want to use any Enterprise Agents that use BrowserBot, choose the Slack Application Layer monitoring template.

    • Otherwise, choose the Slack monitoring template.

    For more information about BrowserBot, see What Is BrowserBot?.

  4. Select your template and proceed with configuring the tests.

For the template fields:

  • Enter your Slack Tenant Name (the example image uses "thousandeyes").

  • Select the Cloud and Enterprise Agents to use for the test.

  • Select a testing interval (ThousandEyes recommends either 1 or 2 minutes, and no greater than 5 minutes).

  • Provide a name for the test suite to easily identify tests using this template. This name is arbitrary, but note that:

    • The name will be used as the prefix for all test names, and longer ones will be harder to distinguish in dashboards/test views.

    • A label will be created with this name, and applied to all the tests in the template.

Note on Entering DNS Resolvers

By default, the template will automatically look up the authoritative DNS servers for the Slack tenant. As described above, the DNS server tests are intended to monitor your DNS resolvers, not the authoritative nameservers (DNS trace tests are used for that instead). 

To enter your DNS resolvers, first click the “x” at the right side of the DNS resolvers input field to clear the servers that were automatically identified. 

Next, place your cursor in the text input field and type the IP address or hostname of your DNS resolver, and press your enter key. Repeat for each of your DNS resolvers. 
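If you keep your resolver list in a file or inventory, it can help to sanity-check each entry before typing it into the field. This is a minimal sketch with a hypothetical helper name; the hostname rule is a conservative approximation and is not the validation the template performs.

```python
import ipaddress
import re

# One DNS label: 1-63 letters, digits, or hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def classify_resolver(entry: str) -> str:
    """Return 'ip' or 'hostname' for a resolver entry, or raise ValueError."""
    value = entry.strip()
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    labels = value.rstrip(".").split(".")
    looks_like_hostname = (
        0 < len(value) <= 253
        and all(_LABEL.match(label) for label in labels)
        and not labels[-1].isdigit()  # rejects malformed IPv4 such as 999.1.1.1
    )
    if looks_like_hostname:
        return "hostname"
    raise ValueError(f"not a valid IP address or hostname: {entry!r}")
```

For example, `classify_resolver("192.0.2.53")` returns `"ip"` and `classify_resolver("dns1.corp.example")` returns `"hostname"`; both values are placeholders from reserved documentation ranges.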

  5. If there are any tests in the template that you want to disable before you deploy the template, use the Disable toggle next to the test name.

  6. Click Review to see what you are about to deploy.

    The dialog moves forward to Step 3 of 3 - Review template, a summary of the tests, dashboards, and labels included in the deployment.

  7. Review the summary.

  8. Click Deploy Now to deploy the monitoring template.

    The deployment process may take a few minutes to complete. When it has finished, the dialog shows a success message.

  9. In the success message, decide your next step:

    • To open Cloud & Enterprise Agents > Test Settings with a filtered view of the tests you have just deployed, click Go to Test Settings.

    • To go to the dashboard you just created through the template, click Go to Dashboards.

    It may take a few moments for the tests to run and to gather results.

The Go to Dashboards link opens the Slack Health Overview dashboard, covered in the next section. The dashboard displays metrics only after the deployed tests have run and gathered results.

Slack Health Overview Dashboard

This section describes the dashboard that is included in the Slack template. The dashboard is designed with the highest-level information shown at the top, with increasing service granularity in the widgets as you navigate down the page.   

All of the widgets in these dashboards allow you to drill down into the individual tests for complete details. Click on the widget, in this example Latency, to open the drilldown dialog, then select the test or tests to view, and click Open in Views. See Troubleshooting with Dashboard Drill Down for more information.

The dashboard provides a service-oriented health overview for Slack. The widgets in this dashboard are primarily grouped by service, highlighting issues that affect one or more of the individual Slack services.

The first three rows of the service-oriented dashboard show Web Health and Network Health from the tests deployed by the Slack template. For web health, two widgets show the 90th percentile and the mean of the measurements to give you an overall health status of Slack.
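To see why the two measures are shown side by side, it helps to compute both for a small sample. The sketch below uses the nearest-rank percentile definition, which is an assumption made for illustration; the dashboard may calculate percentiles differently, and the sample values are invented.

```python
import math
from statistics import mean
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Return the nearest-rank percentile of the samples (0 < pct <= 100)."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct percent of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Invented response times in milliseconds, with one slow outlier.
response_times_ms = [20, 22, 21, 23, 19, 25, 24, 22, 21, 203]
print("mean:", mean(response_times_ms))
print("p90:", percentile_nearest_rank(response_times_ms, 90))
```

In this sample the single slow measurement pulls the mean well above the typical value, while the 90th percentile stays close to it. Reading the two widgets together helps distinguish an isolated spike from a broad slowdown.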

The middle section of the dashboard is intended to further break down the metrics and present the per-test, per-agent performance in multi graphs to show which services are experiencing the issues.

The bottom widgets of the service-oriented dashboard display each of your Slack tests, along with their current alert status, the most recent test measurements, and the trends of those measurements over the last 12 hours. You can click on any of the tests in this list to open the test view. The test widgets are filtered based on internal versus external tests to provide a quick visual of the service health and baseline.

Monitoring Slack with Endpoint Agents

Endpoint Agent tests are not included in the default dashboard, but can be easily added based on your requirements.

In addition to the Cloud and Enterprise Agent templates described above, you can use ThousandEyes Endpoint Agents for real-user monitoring of Slack, combined with scheduled synthetic testing for proactive monitoring. ThousandEyes recommends the scheduled test described below.

Scheduled Tests

Scheduled tests run from Endpoint Agents at regularly scheduled intervals without any user interaction and provide a great baseline for troubleshooting.

Real-User Tests

Real-user tests run from Endpoint Agents and start automatically when a user visits a website in the monitored domain set. Real-user tests are not included in the Slack templates and must be configured manually.

Domains to include in the monitored domain set:

  • [tenant name].enterprise.slack.com

  • wss-primary.slack.com (for WebSocket connections)

  • a.slack-edge.com (Slack’s CDN CloudFront) or b.slack-edge.com

  • app.slack.com (API traffic that is destined for the Slack web application)

  • edgeapi.slack.com (Slack’s user and channel information)

  • slack.com

Internet Insights

Internet Insights is not included in the default dashboard, but can be easily added based on your requirements.

When a critical service is disrupted, it is common to wonder if you are the only one affected by the outage or if the issue is larger in scope or scale. Internet Insights collects data from a diverse set of vantage points across the globe to offer visibility into service providers, including AWS (Amazon Web Services) and Slack. Internet Insights is built upon ThousandEyes’ collective data set (billions of probes across the Internet to websites, apps, and API endpoints every day), combined with outage detection, to provide a macro-scale view into network and application outages. The intelligence derived from this data enables operations teams to quickly identify and resolve issues with providers using concrete Internet telemetry data.

ThousandEyes recommends selecting the following packages in your Internet Insights Catalog Settings configuration so that you can tell whether a larger outage is causing a Slack service disruption. The UCAAS package directly aligns with the Slack application services and will help isolate large outages with Slack. The IAAS package will help you isolate larger issues for the Slack data centers hosted in AWS. Lastly, the ISP package will help show larger ISP outages that are causing Slack-related network access issues.

  • A SAAS package in each applicable region, which includes:

    • Amazon

  • A UCAAS package in each applicable region, which includes:

    • Slack

  • An IAAS package in each applicable region, which includes:

    • Amazon Web Services
