DecoverAI Blog - How to Collect and Produce Slack and Teams Messages for Discovery

Collaboration Tools Are Discovery's Newest Headache

Preserving and producing Slack and Teams messages for litigation involves four stages: legal hold, platform export, processing, and formatted production. Legal teams typically export raw data through Slack's Discovery API or Corporate Export, or through Microsoft Purview for Teams, then normalize it into a review-ready format that reconstructs threads and resolves user identities. DecoverAI automates the pipeline from raw JSON export to Bates-stamped production, handling threading, metadata normalization, and AI-powered relevance classification across both platforms in a single workflow.

Unlike email, which produces discrete, self-contained messages with clear metadata, collaboration platform data is continuous and evolving. A Slack channel is not a document. It is an ongoing conversation that may span months or years, in which messages are edited and deleted, reactions are added, files are shared, and threads branch in multiple directions. The concept of a "document" that can be collected, reviewed, and produced does not map neatly onto this data.

The technical challenges are compounded by the different storage architectures of each platform. Slack stores messages, files, and metadata in its own cloud infrastructure, with different retention and export capabilities depending on the subscription plan (Free, Pro, Business+, or Enterprise Grid). Microsoft Teams is even more fragmented: compliance copies of chat and channel messages are stored in Exchange Online mailboxes, files shared in channels are stored in SharePoint, files shared in chats are stored in the sender's OneDrive, meeting recordings are saved to OneDrive or SharePoint (older recordings may still be in Stream), and search, hold, and export run through Microsoft Purview.

For legal teams facing a discovery obligation that includes collaboration platform data, the learning curve is steep. The tools, techniques, and workflows developed for email and file share discovery do not translate directly. New skills are needed for collection, new processing capabilities are required for normalization, and new approaches must be developed for review and production. This guide provides a practical framework for each of these stages.

Step 1: Understand What You're Collecting

Before you can collect Slack or Teams data, you need to understand the different data types each platform generates. In Slack, the primary data types are: channel messages (public and private channels), direct messages (one-on-one and group DMs), threads (replies to channel messages), reactions (emoji responses), file uploads (files shared within channels or DMs), and app integrations (automated messages from bots and connected services). Each of these data types may be relevant to discovery and must be considered when scoping collection.

Microsoft Teams data types include: channel messages (standard and private channels), chat messages (one-on-one and group chats), meeting chat (messages sent during meetings), files (shared in channels via SharePoint or in chats via OneDrive), meeting recordings and transcripts, Whiteboard content, and Wiki pages within channels. The data is distributed across multiple Microsoft 365 services, which means collection may require accessing several different systems.

Export capabilities vary significantly by plan. Slack's Free plan exposes only the most recent 90 days of message history and limits file storage. The Pro plan provides unlimited message history, but its standard export covers public channels only and it does not include Corporate Export or the Discovery API. Corporate Export, which covers private channels and direct messages, is available on Business+ and Enterprise Grid once Slack approves the owner's application. The Discovery API, which exposes compliance data to approved eDiscovery tools, is an Enterprise Grid feature. If your organization is on a lower-tier plan, you may need to upgrade before you can collect historical data.

Check retention settings before collection. Both Slack and Teams allow administrators to configure message retention policies that automatically delete messages after a specified period. If a retention policy has been deleting messages, data from before the retention window may be permanently lost. Check the retention configuration immediately upon receiving a discovery obligation, and suspend any automatic deletion policies to prevent ongoing spoliation. Document the retention settings as they existed at the time of the litigation hold, as this information may be relevant if the completeness of your collection is challenged.

Step 2: Export Using the Right Method

For Slack, the two primary export methods are the Discovery API and Corporate Export. The Discovery API is available on Enterprise Grid plans and allows organization owners, typically working through an approved eDiscovery application, to export data from specific channels, date ranges, and users. It provides message content, metadata, file links, and user information in JSON format. The Discovery API is the preferred method for targeted collections because it allows you to scope the export to specific custodians or channels.

Slack's Corporate Export is available on Business+ and Enterprise Grid plans, subject to Slack's approval, and provides an export of all messages, files, and metadata across the workspace or, on Enterprise Grid, across every workspace in the organization. This is the broadest option because it collects everything, and it is typically used when the discovery scope is broad or when you need to ensure completeness across the entire organization. The export produces JSON files organized by channel and date, along with associated files and user metadata. Note that even with Corporate Export, direct messages require additional authorization and may not be included by default.

For Microsoft Teams, Microsoft Purview (formerly known as Microsoft 365 Compliance Center) is the primary collection tool. Purview's eDiscovery capabilities allow you to search for and collect Teams messages, channel conversations, files, and other Microsoft 365 content based on custodian, date range, keywords, and other criteria. Purview places content on hold to prevent deletion and allows you to export collected content in various formats for processing and review.

Document your collection methodology thoroughly. Record the export method used, the date and time of export, the scope parameters (custodians, channels, date ranges), any filters applied, and the identity of the person who performed the export. This documentation is essential for defensibility. If opposing counsel challenges the completeness of your collection, you need to be able to demonstrate that you used a reliable method, applied reasonable scope parameters, and captured all data within the defined scope. Courts are increasingly scrutinizing the collection of collaboration platform data, and a well-documented methodology is your best defense.
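One way to keep that documentation consistent is to capture it as a structured record at the moment of export. The sketch below is a minimal illustration, not a prescribed format: the CollectionRecord class, its field names, and the sha256_of_file helper are all hypothetical, and the hash is included on the assumption that you will want to show later that the export archive has not changed since collection.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class CollectionRecord:
    """One entry in a collection log. Field names are illustrative only."""
    platform: str                 # e.g. "slack" or "teams"
    export_method: str            # e.g. "Discovery API", "Corporate Export", "Purview"
    performed_by: str             # person who ran the export
    custodians: list[str]
    channels: list[str]
    date_range: tuple[str, str]   # ISO dates: (start, end)
    filters: list[str] = field(default_factory=list)
    # Export time is recorded in UTC so it is unambiguous across offices.
    exported_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    sha256: str = ""              # hash of the export archive


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large export archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def to_log_line(record: CollectionRecord) -> str:
    """Serialize a record as one JSON line for an append-only collection log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Writing one JSON line per export keeps the log append-only and easy to hand over if the methodology is questioned.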

Step 3: Process and Normalize the Data

Raw Slack and Teams exports are not review-ready. Slack exports arrive as JSON files containing message data, user IDs (not names), channel IDs (not names), and Unix timestamps. Teams data exported from Purview may arrive in various formats depending on the export options selected. Both require significant processing before they can be reviewed in a meaningful way. The processing stage transforms raw export data into a format that reviewers can understand and work with.
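To make "raw" concrete, the sketch below walks a Slack export folder and yields each message along with the channel folder and file it came from. It assumes the common standard-export layout of one folder per channel containing one JSON file per day, each holding an array of messages; Discovery API output and Purview exports are shaped differently and would need their own loaders. The function name and the _channel and _source_file keys are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_export_messages(export_root: Path) -> Iterator[dict]:
    """Yield every message in the export, tagged with its channel and source file.

    Assumed layout (standard Slack export):
        export_root/users.json
        export_root/channels.json
        export_root/<channel-name>/<YYYY-MM-DD>.json   (a JSON array of messages)
    """
    channel_dirs = sorted(p for p in export_root.iterdir() if p.is_dir())
    for channel_dir in channel_dirs:
        # Day files sort chronologically because they are named YYYY-MM-DD.json.
        for day_file in sorted(channel_dir.glob("*.json")):
            messages = json.loads(day_file.read_text(encoding="utf-8"))
            for message in messages:
                # Keep provenance next to the message so a reviewer can always
                # trace a rendered message back to the file it was read from.
                yield {
                    **message,
                    "_channel": channel_dir.name,
                    "_source_file": day_file.name,
                }
```

Carrying the source file name on every message is a small cost that makes later questions about provenance much easier to answer.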

Threading is one of the most critical processing steps. In Slack, replies to a message form a thread that is linked to the parent message via a thread timestamp (the thread_ts field, which holds the parent message's ts value). In the raw export, these replies appear as individual messages scattered throughout the export files. Processing must reconstruct these threads so that reviewers see the complete conversation in context. Without proper threading, a reply that says "I agree, let's proceed" is meaningless, because the reviewer cannot see what the author was agreeing to.
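Thread reconstruction can be sketched in a few lines. The example below is a simplified illustration rather than a production implementation: it relies on the ts and thread_ts fields found in Slack message JSON, assumes all messages come from a single channel (timestamps are only unique within a channel), and assumes six-digit fractional parts. The function name is hypothetical.

```python
from collections import defaultdict


def ts_key(ts: str) -> tuple[int, int]:
    """Sort key for a Slack ts string such as '1700000000.000200'."""
    seconds, _, fraction = ts.partition(".")
    return int(seconds), int(fraction or 0)


def group_into_threads(messages: list[dict]) -> list[list[dict]]:
    """Group one channel's messages into threads: parent first, replies in time order."""
    threads: dict[str, list[dict]] = defaultdict(list)
    for message in messages:
        # A reply carries its parent's ts in thread_ts. A message with no
        # thread_ts stands alone and is keyed by its own ts.
        root_ts = message.get("thread_ts") or message["ts"]
        threads[root_ts].append(message)

    ordered = []
    for root_ts in sorted(threads, key=ts_key):
        ordered.append(sorted(threads[root_ts], key=lambda m: ts_key(m["ts"])))
    return ordered


messages = [
    {"ts": "1700000300.000100", "thread_ts": "1700000000.000200",
     "user": "U2", "text": "I agree, let's proceed"},
    {"ts": "1700000100.000300", "user": "U3", "text": "Lunch?"},
    {"ts": "1700000000.000200", "thread_ts": "1700000000.000200",
     "user": "U1", "text": "Should we sign the vendor contract?"},
]

for thread in group_into_threads(messages):
    print([m["text"] for m in thread])
```

With the sample data, the reply is placed directly under the question it answers, and the unrelated message becomes its own single-message thread.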

User resolution transforms cryptic user IDs into human-readable names. Slack exports reference users by their internal ID (e.g., "U03ABCDEF") rather than their display name or email address. The export includes a users.json file that maps IDs to names, but your processing tool must apply this mapping to every message, including mentions embedded in the message text. Similarly, channel IDs must be resolved to channel names, and file references must be linked to the actual file content. Reaction tracking maps emoji reactions to the users who added them, and to timestamps where the export provides them. Reactions can be legally significant as evidence of awareness or agreement.
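The mapping step might look like the sketch below. It assumes users.json records carry id, name, and optionally real_name, that mentions appear in message text in the plain form <@U03ABCDEF>, and that reactions are stored as a list of objects with name and users keys. The function names and the author_name, text_resolved, and reactions_resolved keys are hypothetical. The original IDs are left in place so the resolved view can always be checked against the source.

```python
import re

# Plain user mentions as they appear in message text, e.g. "<@U03ABCDEF>".
MENTION = re.compile(r"<@([UW][A-Z0-9]+)>")


def build_user_index(users: list[dict]) -> dict[str, str]:
    """Map user ID to the best available name from users.json records."""
    index = {}
    for user in users:
        index[user["id"]] = user.get("real_name") or user.get("name") or user["id"]
    return index


def resolve_message(message: dict, user_index: dict[str, str]) -> dict:
    """Return a copy of the message with author, mentions, and reactions resolved."""

    def name_for(user_id: str) -> str:
        # Never drop an ID that cannot be resolved; surface it for follow-up.
        return user_index.get(user_id, f"unknown user ({user_id})")

    resolved = dict(message)  # shallow copy; original fields are preserved
    resolved["author_name"] = name_for(message.get("user", ""))
    resolved["text_resolved"] = MENTION.sub(
        lambda match: "@" + name_for(match.group(1)), message.get("text", "")
    )
    resolved["reactions_resolved"] = [
        {"emoji": reaction["name"],
         "by": [name_for(uid) for uid in reaction.get("users", [])]}
        for reaction in message.get("reactions", [])
    ]
    return resolved


users = [
    {"id": "U03ABCDEF", "name": "jdoe", "real_name": "Jane Doe"},
    {"id": "U04XYZ123", "name": "bsmith"},
]
message = {
    "user": "U03ABCDEF",
    "text": "<@U04XYZ123> please review",
    "reactions": [{"name": "thumbsup", "users": ["U04XYZ123"], "count": 1}],
}
print(resolve_message(message, build_user_index(users)))
```

Flagging unresolved IDs instead of silently skipping them matters in practice: deactivated accounts and external guests are common gaps, and they are often the custodians someone will ask about.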

Timestamp normalization is essential when dealing with global organizations. Slack records message timestamps as Unix epoch values, which are UTC-based and carry no local timezone, but reviewers need to see times in the relevant local timezone. Teams data may or may not include timezone information, depending on the export method. Normalize all timestamps to a consistent format and timezone, and document the normalization approach so that any time-based analysis can be verified. DecoverAI handles all of these processing steps automatically, transforming raw Slack and Teams exports into a unified, review-ready format with threading, user resolution, file association, and normalized timestamps.
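As an illustration of the normalization step, the sketch below converts a Slack ts string into a timezone-aware UTC datetime and then renders it in a chosen review timezone. The function names are hypothetical. The fractional part of ts mainly serves to keep timestamps unique within a channel; it is kept here as microseconds so that message ordering survives the conversion. A fixed UTC-5 offset stands in for the review timezone to keep the example self-contained; a real workflow would more likely pass a named zone such as zoneinfo.ZoneInfo("America/New_York") so that daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def slack_ts_to_utc(ts: str) -> datetime:
    """Convert a Slack ts string (epoch seconds plus a fraction) to an aware UTC datetime."""
    seconds, _, fraction = ts.partition(".")
    # Parse the two parts separately to avoid float rounding on the fraction.
    micros = int(fraction.ljust(6, "0")[:6]) if fraction else 0
    base = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return base + timedelta(microseconds=micros)


def to_review_time(ts: str, review_tz: tzinfo) -> str:
    """Render a Slack ts as ISO 8601 in the review timezone, with an explicit offset."""
    return slack_ts_to_utc(ts).astimezone(review_tz).isoformat()


# Fixed offset used only for illustration (no daylight saving handling).
utc_minus_5 = timezone(timedelta(hours=-5))

print(slack_ts_to_utc("1700000000.000200").isoformat())
# 2023-11-14T22:13:20.000200+00:00
print(to_review_time("1700000000.000200", utc_minus_5))
# 2023-11-14T17:13:20.000200-05:00
```

Keeping the explicit offset in the rendered string means a reader never has to guess which timezone a displayed time refers to, and the original ts value should still be retained alongside it as the message's identifier.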

Step 4: Review and Produce

Reviewing collaboration platform data requires a different approach than reviewing traditional email. The key principle is to preserve conversational context. A single Slack message taken out of context can be misleading or incomprehensible. Reviewers need to see messages within their thread, within their channel, and within the broader conversation flow. Review platforms that display messages in a threaded, chronological view — similar to how they appear in the native application — are far more effective than platforms that treat each message as a standalone document.

Negotiate the production format for collaboration platform data in your ESI protocol before you begin review. There is no universally accepted standard for producing Slack or Teams data. Some parties produce chat data as TIFF images of reconstructed conversations, similar to how email is produced. Others produce in native JSON format with a load file. Still others produce HTML renderings that preserve the visual appearance of the conversation. Each approach has trade-offs in terms of usability, searchability, and Bates stamping capability.

Bates stamping collaboration platform data raises unique challenges. In traditional document production, each page receives a unique Bates number. But what constitutes a "page" in a Slack channel with thousands of messages? Common approaches include: stamping each individual message (which can produce enormous Bates ranges), stamping each conversation thread as a unit, or stamping each channel export as a single document with page breaks at regular intervals. The approach should be agreed upon in the ESI protocol and applied consistently throughout the production.
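The thread-as-a-unit approach can be sketched as a simple sequential numbering pass. This is an illustration under stated assumptions, not a standard: each thread is assumed to have already been rendered to a known number of pages, and the prefix, the seven-digit width, the BatesRange class, and the assign_bates function are all hypothetical choices that would in practice come from the ESI protocol.

```python
from dataclasses import dataclass


@dataclass
class BatesRange:
    doc_id: str
    begin: str
    end: str


def assign_bates(
    docs: list[tuple[str, int]],
    prefix: str,
    start: int = 1,
    width: int = 7,
) -> list[BatesRange]:
    """Assign consecutive Bates ranges to (doc_id, page_count) pairs, in order."""
    ranges = []
    next_number = start
    for doc_id, page_count in docs:
        if page_count < 1:
            raise ValueError(f"{doc_id}: page_count must be at least 1")
        first = next_number
        last = next_number + page_count - 1
        ranges.append(
            BatesRange(doc_id, f"{prefix}{first:0{width}d}", f"{prefix}{last:0{width}d}")
        )
        next_number = last + 1  # no gaps and no overlaps between documents
    return ranges


# Each rendered thread is treated as one document with a known page count.
threads = [("thread-1700000000.000200", 3), ("thread-1700000100.000300", 1)]
for bates in assign_bates(threads, prefix="ABC"):
    print(bates.doc_id, bates.begin, bates.end)
# thread-1700000000.000200 ABC0000001 ABC0000003
# thread-1700000100.000300 ABC0000004 ABC0000004
```

Because the ranges are assigned in a single ordered pass, the numbering is reproducible as long as the thread order and page counts are fixed before stamping begins.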

The scale of collaboration platform data can be enormous. The Pointe case study demonstrates DecoverAI's ability to handle this scale: the platform processed over 1 million files from a 1 terabyte dataset that included collaboration platform data alongside traditional document types. The AI-powered classification system categorized documents by type, topic, and relevance, allowing reviewers to focus their limited time on the most important material. For collaboration platform data specifically, the platform's threading and context-preservation features ensured that reviewers could assess conversations in their natural flow rather than as disconnected individual messages.

How DecoverAI Handles Collaboration Platform Data

DecoverAI provides a unified review environment that brings Slack, Teams, email, and traditional document data together in a single platform. Instead of switching between different tools or review interfaces for different data types, reviewers work in one consistent environment where all data is searchable, taggable, and producible using the same workflow. This eliminates the fragmentation that plagues teams trying to review collaboration data alongside traditional documents.

The platform's AI classification capabilities extend to collaboration platform data. The system analyzes message content, participant patterns, channel names, and conversational context to classify messages by topic, relevance, and privilege status. This is particularly valuable for Slack and Teams data, where the sheer volume of casual conversation can bury the legally significant communications. AI classification surfaces the messages that matter and deprioritizes the noise — the lunch plans, the GIF reactions, the daily standup check-ins — that consumes reviewer time without producing responsive material.

DecoverAI's processing pipeline automatically handles the technical complexities of collaboration platform data: threading reconstruction, user and channel resolution, file association, reaction tracking, and timestamp normalization. Raw Slack JSON exports and Teams Purview exports are transformed into review-ready documents within hours of upload. The platform maintains all original metadata while presenting the data in a human-readable format that preserves conversational context.

Pricing is straightforward: $60 per gigabyte for processing and hosting, with no per-user licensing. This is a critical distinction from traditional eDiscovery platforms, which often charge per-user fees that make collaboration platform review prohibitively expensive. When a Slack workspace has hundreds of users and millions of messages, per-user pricing can easily exceed the cost of the underlying review. DecoverAI's volume-based pricing keeps costs predictable regardless of how many custodians or channels are involved in the collection.
