Tracking Conversations: Measuring Content and Identity Exposure on AI Chatbots
What We Studied & What We Found
AI chatbots have become a primary interface for seeking information online. As their popularity grows, providers increasingly deploy advertising and analytics infrastructure - raising questions about what happens to the sensitive conversations users type into these interfaces.
We present the first systematic measurement study of web tracking on 20 popular AI chatbots. Using controlled test accounts and a deliberately sensitive prompt ("pregnancy test near me"), we captured and analyzed HTTP network traffic to measure two categories of exposure: content (user prompts, prompt-derived titles, chat URLs, chat identifiers) and identity (names, email addresses, account identifiers, first-party cookies, IP addresses, User-Agent strings).
We find that 17 of 20 chatbots share information with at least one third party. Three services - Genspark, SeaArt, and ChatOn - transmit the full conversation text in plaintext to Microsoft Clarity, a session replay service, during normal authenticated sessions. Microsoft's Copilot also embeds Clarity, but there the conversation is tracked without being recorded in plaintext. Fifteen chatbots expose conversation URLs or identifiers to third-party advertising and analytics endpoints. Several expose user identity through support widgets, analytics tags, and error monitoring - including hashed email addresses, which are typically used for cross-site tracking and targeted advertising.
We also evaluate private and temporary chat modes, finding they dramatically reduce tracking: from 178 observed (chatbot, third-party) pairs in normal sessions to just 13 in private mode, with zero content or identity exposure detected across all 10 services tested. Finally, we analyze privacy policy disclosures and find notable gaps between what services actually send and what they disclose.
Exposure Tracking Matrix
Which chatbots expose which types of information, and to whom.
Legend: U = URL, B = Body, H = Header, C = Cookie, ★ = supports private mode
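As a rough illustration of the legend's four channel codes, the sketch below checks where a given string shows up in a single captured request. It assumes the codes refer to the part of the HTTP request that carries the information; the function name `exposure_channels`, the example endpoint, and the header values are hypothetical and not taken from the study's tooling.

```python
from urllib.parse import unquote_plus


def exposure_channels(needle: str, url: str, headers: dict[str, str], body: str) -> list[str]:
    """Return the legend codes (U, B, H, C) for each request part containing `needle`."""
    needle = needle.lower()
    found = []
    # U: decode percent-escapes and '+' so URL-encoded prompts still match.
    if needle in unquote_plus(url).lower():
        found.append("U")
    # B: request body, compared case-insensitively.
    if needle in body.lower():
        found.append("B")
    # Split the Cookie header from all other headers.
    cookie_value = ""
    other_values = []
    for name, value in headers.items():
        if name.lower() == "cookie":
            cookie_value = value
        else:
            other_values.append(value)
    # H: any non-cookie header value.
    if any(needle in value.lower() for value in other_values):
        found.append("H")
    # C: the Cookie header.
    if needle in cookie_value.lower():
        found.append("C")
    return found


# Hypothetical request to a made-up collection endpoint.
codes = exposure_channels(
    "pregnancy test near me",
    "https://t.example/collect?title=pregnancy+test+near+me",
    {"Referer": "https://chat.example/c/123", "Cookie": "sid=abc"},
    '{"text": "Pregnancy test near me"}',
)
```

A real pipeline would also need to handle compressed or binary bodies and encoded variants of the string, which this sketch leaves out.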
Third-Party Data Flow Visualization
Flow diagram mapping which chatbots send data to which third-party services, in both normal and private sessions.
Case Studies
Three categories of serious data exposure discovered during the study, each with full technical detail and captured network payloads.
Private Mode Effectiveness
For the 10 services offering a private or temporary chat mode, tracking collapses almost entirely. No content or identity exposure was detected in any private session.
Private / Temporary Chat Support
Starred services were evaluated in both normal and private modes.
Privacy Policy Analysis
All 20 privacy policies acknowledge general data practices, but most stop there. Only 8 name specific third-party recipients - and three services have a critical gap.
| Chatbot | Names Specific Recipients | Notable Gap |
|---|---|---|
Study Methodology
A controlled measurement study using fresh test accounts, a single sensitive prompt, and comprehensive traffic analysis with 12+ encoding variants to catch hashed identifiers.
Chatbot Selection
20 popular AI chatbots spanning major US and international providers, covering closed-weight (ChatGPT, Claude, Gemini) and open-weight (DeepSeek, Qwen) deployments, both consumer- and developer-oriented.
Prompt Design
"pregnancy test near me" was chosen for combining a sensitive health topic with an implicit location - a high-stakes query category where privacy exposure carries real-world consequences.
Encoding & Hash Coverage
Identity strings were searched across 12+ encoding and hash variants, including base64, URL-encoding, hex, MD5, SHA-1/256/512, SHA3, RIPEMD-160, CRC-32, and Adler-32, to catch hashed email identifiers.
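A minimal sketch of how such a variant search could work, using only the Python standard library. It is not the study's tooling: the function names are hypothetical, SHA3 is represented by SHA3-256 only, and RIPEMD-160 is omitted because its availability in `hashlib` depends on the local OpenSSL build.

```python
import base64
import hashlib
import urllib.parse
import zlib


def identity_variants(value: str) -> dict[str, str]:
    """Return several encoded and hashed forms of an identity string."""
    raw = value.encode("utf-8")
    return {
        "plain": value,
        "base64": base64.b64encode(raw).decode("ascii"),
        "url": urllib.parse.quote(value, safe=""),
        "hex": raw.hex(),
        "md5": hashlib.md5(raw).hexdigest(),
        "sha1": hashlib.sha1(raw).hexdigest(),
        "sha256": hashlib.sha256(raw).hexdigest(),
        "sha512": hashlib.sha512(raw).hexdigest(),
        "sha3_256": hashlib.sha3_256(raw).hexdigest(),
        # Checksums rendered as 8 lowercase hex digits (an assumed wire format).
        "crc32": format(zlib.crc32(raw), "08x"),
        "adler32": format(zlib.adler32(raw), "08x"),
    }


def find_variants(payload: str, value: str) -> list[str]:
    """Name every variant of `value` found in `payload`.

    Matching is case-insensitive, which is deliberately lenient (it also
    ignores case in base64 tokens) and can therefore over-match.
    """
    haystack = payload.lower()
    return [
        name
        for name, token in identity_variants(value).items()
        if token.lower() in haystack
    ]


# Hypothetical test-account email and tracking request.
email = "test.user@example.com"
payload = "https://t.example/collect?em=" + hashlib.sha256(email.encode()).hexdigest()
hits = find_variants(payload, email)
```

Here `hits` contains `"sha256"` even though the plaintext address never appears in the request, which is the case a plaintext-only search would miss.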
Party Attribution
eTLD+1 matching classifies domains. Platform parties (e.g. Google for Gemini) are distinguished from independent third parties. Categories: Advertising, Analytics, Other.
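The attribution step can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: a real eTLD+1 computation relies on the full Public Suffix List, whereas `MULTI_LABEL_SUFFIXES` below is a tiny hand-picked subset, and the function names and the `platform_domains` parameter are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: a tiny illustrative subset of the Public Suffix List.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def etld_plus_one(host: str) -> str:
    """Approximate the registrable domain (eTLD+1) of a hostname."""
    labels = host.lower().rstrip(".").split(".")
    # Keep three labels when the last two form a known multi-label suffix.
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def classify_request(
    page_url: str,
    request_url: str,
    platform_domains: frozenset[str] = frozenset(),
) -> str:
    """Label a request as first-party, platform, or third-party."""
    page = etld_plus_one(urlsplit(page_url).hostname or "")
    target = etld_plus_one(urlsplit(request_url).hostname or "")
    if target == page:
        return "first-party"
    # Domains operated by the chatbot's own provider (supplied by the caller).
    if target in platform_domains:
        return "platform"
    return "third-party"


# Hypothetical chatbot origin; clarity.ms is Microsoft Clarity's domain.
label = classify_request("https://chat.example.com/c/1", "https://www.clarity.ms/collect")
```

Mapping each third-party domain to Advertising, Analytics, or Other would need a separate lookup table, which is not shown here.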
Private Mode Evaluation
10 services support private or temporary chat. Each was evaluated with the same procedure as normal sessions, and the results were compared to measure how effectively private mode reduces tracking.
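To make the comparison metric concrete, the sketch below counts unique (chatbot, third-party) pairs and computes the relative reduction between modes. The function names and the sample records are hypothetical; only the totals 178 and 13 come from the study.

```python
def tracking_pairs(records: list[tuple[str, str]]) -> set[tuple[str, str]]:
    """Collapse (chatbot, third_party_domain) records into unique pairs."""
    return set(records)


def reduction(normal_pairs: int, private_pairs: int) -> float:
    """Fraction of pairs that disappear when switching to private mode."""
    return (normal_pairs - private_pairs) / normal_pairs


# Hypothetical records: repeated requests to the same party count once.
sample = [
    ("chatbot-a", "tracker.example"),
    ("chatbot-a", "tracker.example"),
    ("chatbot-a", "ads.example"),
    ("chatbot-b", "tracker.example"),
]
pair_count = len(tracking_pairs(sample))

# Totals reported by the study: 178 pairs in normal mode, 13 in private mode.
summary = f"{reduction(178, 13):.1%}"
```

With the reported totals, 165 of 178 pairs disappear in private mode, a reduction of roughly 92.7%.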
Scope & Limitations
Web interfaces only - excludes mobile apps, extensions, and embedded chatbot deployments. Single controlled prompt. Chrome baseline without tracking protection. Measurement-only; does not evaluate user consent flows.
Cite This Work
Available on arXiv: arXiv:2604.27438 [cs.CR]