Making Diagnostic Analytics Trustworthy

Customers won’t take your word for it. Diagnostic AI needs to prove its accuracy.
Insights

Dec 18, 2025

7 min read

Henry Arbolaez

Senior Software Engineer

Descriptive analytics shows you what happened (e.g., conversion dropped 15%). Diagnostic analytics explains why (e.g., because Safari mobile users hit a form bug introduced in a recent release).

Most analytics tools today are descriptive. We built Amplitude’s automated insights to be diagnostic because teams need a clearer understanding of cause and effect. The capability reads experiments, releases, annotations, and segments, and uses that context to form hypotheses about likely explanations.

Diagnostic analytics will be a huge step forward for customers who want more value out of their data analysis. But we can’t simply tell our customers that we trained AI to uncover root causes and expect them to believe us. We have to show them repeatedly that our AI is accurate to earn their trust.

Why trust matters

We all know trust matters. If Amplitude’s output is incorrect, teams will make the wrong decision. That is lose-lose: our customers take a step back, and we lose their trust. The only way to deliver value (and earn that trust) with automated insights was to make them accurate. Before we shipped anything, we knew we had to define accuracy and measure it in a way that credibly mapped to how people use insights every day.

We quickly found that diagnostic AI doesn’t need to be perfect to provide value. It simply needs to be consistently helpful. That finding became our guiding principle.

How we measure insight accuracy

We evaluate the outputs of our automated insights capability using two components:

  • Real cases: Each test case is an existing example that human analysts have already analyzed to find the correct root cause.
  • A separate judging model: A model that is independent of the system itself checks whether the AI discovered the correct explanation. It performs similarly to human reviewers.

Then we tracked recall, precision, and insightfulness (i.e., whether the system produced at least one correct insight). Interestingly, we found that insightfulness was the most meaningful measure.

Analysts rarely need a perfect, fully polished narrative. With a clear starting point, they can get an answer fast. Once the system could produce a correct insight 80% of the time, we knew it could dramatically reduce the time analysts spent investigating issues. That level of accuracy was enough for us to move forward.
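As a rough illustration of how recall, precision, and insightfulness relate, here is a minimal Python sketch. It is not Amplitude’s evaluation code: the EvalCase schema, the score function, and the example root-cause labels are all hypothetical, and exact string matching stands in for the judging model’s decision about whether an explanation is correct.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    """One historical case (hypothetical schema, for illustration only)."""
    expected: set[str]   # root causes that human analysts confirmed
    predicted: set[str]  # root causes the system proposed


def score(cases: list[EvalCase]) -> dict[str, float]:
    """Micro-averaged precision and recall, plus insightfulness."""
    true_pos = sum(len(c.expected & c.predicted) for c in cases)
    total_predicted = sum(len(c.predicted) for c in cases)
    total_expected = sum(len(c.expected) for c in cases)
    # Insightfulness: share of cases with at least one correct insight.
    insightful = sum(1 for c in cases if c.expected & c.predicted)
    return {
        "precision": true_pos / total_predicted if total_predicted else 0.0,
        "recall": true_pos / total_expected if total_expected else 0.0,
        "insightfulness": insightful / len(cases) if cases else 0.0,
    }


# Made-up cases: exact label matching is an assumption standing in for the judge.
cases = [
    EvalCase({"safari_form_bug"}, {"safari_form_bug", "bot_traffic"}),
    EvalCase({"pricing_change", "release_regression"}, {"pricing_change"}),
    EvalCase({"tracking_outage"}, {"seasonality"}),
]
metrics = score(cases)
```

In this toy data, precision and recall are both 0.5, while insightfulness is two out of three cases. A case counts as insightful even when the system misses a second root cause or adds a wrong one, which is why the measure tracks "did the analyst get a useful starting point" more directly than the other two.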

Using confidence levels for partial insights

Some problems are more nuanced and don’t have a single definitive explanation. Bot activity is a great example of this. You can often identify bot-like patterns, but quantifying their exact impact is nearly impossible.

Instead of pretending to know more than it does, we designed our AI to report levels of confidence. It might flag bot traffic as a likely factor without overstating precision. Customers consistently tell us that even partial insights help them work faster. A hint that points in the right direction often unlocks the next step. Even disproving a hypothesis is valuable because it narrows the investigation.

Transparency about uncertainty turns our automated insights capability into a collaborator rather than a black box for teams.
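One way to picture confidence-aware reporting is a small data model in which every insight carries an explicit confidence level and a flag for whether its impact could be measured. The Python sketch below is only an assumption about what such a structure could look like. The Confidence levels, the Insight fields, and the wording chosen in render are hypothetical, not Amplitude’s actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class Insight:
    """A single finding (hypothetical fields, for illustration only)."""
    summary: str
    confidence: Confidence
    quantified: bool  # whether the impact could be measured


def render(insight: Insight) -> str:
    """Phrase the insight so the wording matches the confidence level."""
    qualifier = {
        Confidence.HIGH: "Likely cause",
        Confidence.MEDIUM: "Possible factor",
        Confidence.LOW: "Weak signal",
    }[insight.confidence]
    note = "" if insight.quantified else " (impact not quantified)"
    return f"{qualifier}: {insight.summary}{note}"


example = Insight("bot-like traffic spike", Confidence.MEDIUM, quantified=False)
message = render(example)
# message == "Possible factor: bot-like traffic spike (impact not quantified)"
```

Keeping the confidence level in the data, rather than only in the generated prose, means the hedged wording is applied consistently and cannot be dropped by accident.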

Transparency builds trust

Analysts trust insights more when they can see the underlying logic. Our AI exposes live reasoning so users can watch the system work in real time, including which tools it calls and what information it checks. It also surfaces inline citations, linking all of its assumptions and findings directly to the sources it used to arrive at that conclusion.

We have found that most people check everything closely the first few times. Once they see that the results hold up, they become less skeptical.
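To make the idea of inline citations concrete, the following Python sketch attaches sources to each finding and flags any claim that has none. The Citation and Finding types, the source_type values, and the identifiers in the example are hypothetical and are not taken from Amplitude’s product.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    source_type: str  # e.g. "chart", "release", "experiment" (assumed labels)
    source_id: str


@dataclass
class Finding:
    claim: str
    citations: list[Citation] = field(default_factory=list)


def render_with_citations(finding: Finding) -> str:
    """Append bracketed source references to the claim text."""
    refs = ", ".join(f"[{c.source_type}:{c.source_id}]" for c in finding.citations)
    return f"{finding.claim} {refs}".rstrip()


def uncited(findings: list[Finding]) -> list[str]:
    """Return claims with no supporting source, so they can be reviewed."""
    return [f.claim for f in findings if not f.citations]


findings = [
    Finding("Conversion dropped on Safari mobile",
            [Citation("chart", "c-1"), Citation("release", "r-7")]),
    Finding("Bot traffic may be a factor"),
]
rendered = render_with_citations(findings[0])
missing = uncited(findings)
```

A check like uncited could run before an answer is shown, so that every statement a reader sees can be traced back to something they are able to verify.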

What we learned from failure

Because we built evals into our development loop, we could clearly see recurring areas for improvement: missing tools, missing context, misordered steps, too much data in the context window, and prompts that lacked sufficient guidance.

Each issue pointed directly to the fix. Missing bot detection? We needed to build a tool for it. Missing release context? We needed to pull it into the workflow. Funnel root cause hidden between steps? We needed to create micro-funnel analysis.
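The loop from failure category to fix can be sketched as a simple tally. In the Python example below, the failure labels, the FIXES mapping, and the prioritize helper are hypothetical. The sketch assumes each failed eval case has already been tagged with one category during review.

```python
from collections import Counter

# Hypothetical mapping from failure category to the kind of fix it implies.
FIXES = {
    "missing_tool": "build the tool (e.g., bot detection)",
    "missing_context": "pull the context into the workflow (e.g., releases)",
    "hidden_between_funnel_steps": "add micro-funnel analysis",
}


def prioritize(failure_labels: list[str]) -> list[tuple[str, int, str]]:
    """Rank failure categories by frequency and pair each with a fix."""
    counts = Counter(failure_labels)
    return [
        (label, n, FIXES.get(label, "needs triage"))
        for label, n in counts.most_common()
    ]


# Made-up labels from one imaginary eval run.
labels = [
    "missing_context", "missing_tool", "missing_context",
    "misordered_steps", "missing_context", "missing_tool",
]
plan = prioritize(labels)
```

Here the most frequent category, missing_context, comes first, and a category without a known fix falls back to "needs triage" rather than being dropped.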

Evals and this tight feedback loop let us continually improve the system in ways that aligned more closely with analyst workflows rather than with guesses or hypothetical methods.

When there is more than one valid explanation

Real-world data is often ambiguous. We wanted our model to account for that. As a result, instead of only offering a single answer, our automated insights capability can present multiple plausible explanations and let the analyst decide which is best.

This ensures a collaborative partnership between the analyst and the AI. Together, they can decide which hypotheses to explore. This setup makes the system more realistic because, in practice, teams often weigh several hypotheses before landing on the right one.
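A minimal sketch of presenting several explanations instead of one: keep every hypothesis whose evidence score is close to the best one, and let the analyst choose among them. The Hypothesis type, the numeric score, and the margin parameter are assumptions made for illustration. The article does not describe how explanations are actually scored or selected.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    explanation: str
    score: float  # hypothetical evidence score in [0, 1]


def plausible(hypotheses: list[Hypothesis], margin: float = 0.2) -> list[Hypothesis]:
    """Return all hypotheses within `margin` of the top score, best first."""
    if not hypotheses:
        return []
    ranked = sorted(hypotheses, key=lambda h: h.score, reverse=True)
    best = ranked[0].score
    return [h for h in ranked if best - h.score <= margin]


candidates = [
    Hypothesis("Seasonal dip", 0.3),
    Hypothesis("Form bug in recent release", 0.8),
    Hypothesis("Experiment variant underperforming", 0.7),
]
shortlist = [h.explanation for h in plausible(candidates)]
```

With these made-up scores, the shortlist keeps the release bug and the experiment variant and drops the seasonal dip, leaving the final judgment to the analyst.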

Descriptive → diagnostic → predictive

The evolution of Amplitude’s AI mirrors how LLMs have changed. We started with descriptive analytics: Ask Amplitude let users translate natural language into charts. Our automated insights capability now lets users perform diagnostic analytics to quickly understand why metrics change. The next natural frontier is predictive analytics, which will help everyone understand what is likely to happen next.

Predictive analytics requires strong diagnostic tools. It’s impossible to forecast the future effectively if you do not understand the forces behind past changes. We feel confident that the diagnostic foundation we are building today will power the predictive tools that come next.

It starts and ends with trust

Making diagnostic analytics trustworthy is not about making AI sound smarter. It’s about giving people insights they can rely on. Our AI will earn trust by showing its work, expressing uncertainty honestly, learning from failures, and anchoring its reasoning in patterns that mirror how analysts think.

These same principles apply to anyone evaluating or building AI systems designed to explain, recommend, or diagnose. Trust is not something you bolt on at the end. It’s something you earn through design choices that prioritize clarity and transparency.

Looking to find out more about Amplitude’s AI innovations? Visit us here:

About the author
Henry Arbolaez

Senior Software Engineer

Henry Arbolaez is a Senior Software Engineer at Amplitude working on AI-powered products. He enjoys building from zero to one, loves good coffee, and is always looking for the next place to travel.

Topics

AI

Amplitude Analytics

Analytics

© 2025 Amplitude, Inc. All rights reserved. Amplitude is a registered trademark of Amplitude, Inc.