Amplitude Analytics Implementation Guide

Overview

This document explains the Amplitude analytics implementation for Archety. The system tracks user engagement, feature usage, LLM costs, and product metrics to provide actionable insights.

Implementation Date: November 6, 2025
Amplitude API Key: Configured via environment variable


What's Been Implemented

✅ Core Infrastructure

  • Analytics Service (app/analytics/amplitude_service.py)
    - Complete wrapper around Amplitude Python SDK
    - 40+ event tracking methods
    - Automatic property sanitization and privacy protection
    - Graceful error handling (never crashes app)
    - No-op mode when API key not configured

  • Tracking Helper (app/analytics/tracking_helper.py)
    - High-level convenience methods
    - Automatic user creation and identification
    - Context-aware tracking with minimal code changes

  • Cost Calculation (app/utils/llm_client.py)
    - Token cost estimation for all LLM models
    - Supports GPT-4o, GPT-4o-mini, GPT-5 series
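
The cost helper itself is not reproduced in this guide. As a rough illustration of how token-based cost estimation can work, here is a minimal sketch. The function name `estimate_llm_cost`, the model names, and every price in the table are placeholders invented for this example, not the project's real rates or API.

```python
from decimal import Decimal

# Hypothetical USD prices per 1M tokens: (input, output). Placeholder values
# for illustration only; substitute the provider's current published rates.
PRICES_PER_MILLION = {
    "example-large": (Decimal("2.50"), Decimal("10.00")),
    "example-small": (Decimal("0.15"), Decimal("0.60")),
}


def estimate_llm_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one LLM call from its token counts."""
    if model not in PRICES_PER_MILLION:
        return 0.0  # Unknown model: report zero rather than raise.
    input_price, output_price = PRICES_PER_MILLION[model]
    cost = (
        Decimal(prompt_tokens) * input_price
        + Decimal(completion_tokens) * output_price
    ) / Decimal(1_000_000)
    return float(cost)
```

Returning zero for an unknown model keeps analytics from failing when a new model is rolled out before the price table is updated, at the price of under-reporting cost until it is.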

✅ Events Currently Tracked

1. Acquisition & Onboarding

  • user_discovered - First message from unknown user
  • persona_selected - First interaction with Sage/Echo/Vex
  • phone_verification_completed - First bidirectional conversation (future)

Implementation: app/main.py (lines 337-348, 498-509)

2. Messaging & Engagement

  • message_sent_user - Every inbound message (iMessage + Telegram)
  • message_sent_companion - Every outbound message (ready, not wired)
  • conversation_started - Session start detection (ready, not wired)
  • conversation_ended - Session end detection (ready, not wired)

Implementation:

  • iMessage: app/main.py (lines 337-348)
  • Telegram: app/main.py (lines 498-509)

3. Integrations

  • integration_connected - OAuth successful (Calendar/Gmail)
  • integration_disconnected - OAuth revoked

Implementation: app/oauth/routes.py (lines 247-282, 511-525)

4. User Properties

Automatically tracked:

  • signup_date - When user created
  • primary_channel - imessage | telegram
  • primary_persona - sage | echo | vex
  • total_messages_sent - Incremented automatically
  • total_messages_received - Incremented automatically
  • calendar_connected - Boolean
  • gmail_connected - Boolean

Implementation: app/analytics/tracking_helper.py::track_user_message()

⏳ Events Ready But Not Wired Yet

The following events have full tracking methods but aren't yet integrated into the codebase:

Memory System

  • memory_created
  • memory_retrieved
  • memory_forgotten
  • boundary_set

Where to wire: app/memory/mem0_service.py

Relationship Progression

  • relationship_stage_changed

Where to wire: app/orchestrator/relationship_service.py

Superpowers

  • superpower_triggered
  • superpower_result_sent
  • superpower_accepted
  • superpower_rejected
  • superpower_enabled
  • superpower_disabled

Where to wire: app/superpowers/engine.py, app/superpowers/manager.py

LLM Usage

  • llm_request_started
  • llm_request_completed

Where to wire: app/orchestrator/two_stage_handler.py, app/utils/llm_client.py
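
One way to wire llm_request_completed without touching every call site's logic is a context manager that times the call and reports once it finishes. This is only a sketch: `track_llm_call` and the `report` callable are hypothetical stand-ins for whatever tracking method the real service exposes, and `report` is assumed not to raise.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


@contextmanager
def track_llm_call(report: Callable[[str, Dict], None], model: str) -> Iterator[Dict]:
    """Time an LLM call and report llm_request_completed when it finishes.

    `report` is a hypothetical stand-in for the real tracking method.
    """
    usage: Dict = {"prompt_tokens": 0, "completion_tokens": 0}
    started = time.perf_counter()
    status = "success"
    try:
        yield usage  # Caller fills in token counts from the LLM response.
    except Exception:
        status = "error"
        raise  # The original error still propagates to the caller.
    finally:
        report(
            "llm_request_completed",
            {
                "model": model,
                "status": status,
                "latency_ms": int((time.perf_counter() - started) * 1000),
                "total_tokens": usage["prompt_tokens"] + usage["completion_tokens"],
                **usage,
            },
        )


# Usage: collect events in a list instead of sending them anywhere.
events = []
with track_llm_call(lambda name, props: events.append((name, props)), "gpt-4o") as usage:
    usage["prompt_tokens"] = 1200
    usage["completion_tokens"] = 150
```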

Reliability

  • message_delivery_failed
  • superpower_error
  • oauth_refresh_failed

Where to wire: Error handlers in respective modules
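
For the reliability events, a decorator around the failing operation is one possible wiring pattern. The sketch below is hypothetical (`track_failure` and `report` are names made up for this example). It sends only the exception class name, in keeping with the rule that message content is never tracked.

```python
import functools
from typing import Callable, Dict


def track_failure(report: Callable[[str, Dict], None], event_name: str):
    """Decorator: report `event_name` with the exception type, then re-raise."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                try:
                    # Only the exception class name is sent, never its message,
                    # which could contain user content.
                    report(event_name, {"error_type": type(exc).__name__})
                except Exception:
                    pass  # Analytics must never mask the original error.
                raise

        return wrapper

    return decorator
```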


Setup Instructions

1. Install Amplitude SDK

# Quote the requirement so the shell does not treat ">" as a redirect
pip install "amplitude-analytics>=1.1.0"

Or add to requirements.txt (already done):

amplitude-analytics>=1.1.0  # Product analytics

2. Configure API Key

Add to .env file (use your project's key; never commit a real key to the repository):

AMPLITUDE_API_KEY=your-amplitude-api-key

Or set as environment variable:

export AMPLITUDE_API_KEY="your-amplitude-api-key"

3. Restart Server

# Development
uvicorn app.main:app --reload

# Production (Railway auto-deploys with new env var)
railway up

4. Verify Events in Amplitude

  1. Log into Amplitude dashboard
  2. Navigate to "Events" → "Live Stream"
  3. Send a test message to Sage via iMessage or Telegram
  4. Verify events appear:
    - user_discovered (if new user)
    - persona_selected (if first message to persona)
    - message_sent_user

Usage Examples

Basic Event Tracking

from app.analytics.amplitude_service import get_amplitude_service

amplitude = get_amplitude_service()

# Track a simple event
amplitude.track_event(
    event_name="feature_used",
    user_id="user-uuid",
    properties={
        "feature_name": "calendar_sync",
        "success": True
    }
)

Using the Tracking Helper

from app.analytics.tracking_helper import get_tracking_helper

tracker = get_tracking_helper()

# Track user message (handles user_discovered, persona_selected automatically)
tracker.track_user_message(
    phone="+15551234567",
    text="What's on my calendar today?",
    chat_guid="imessage:chat123",
    channel="imessage",
    persona_id="sage",
    mode="direct",
    participants=["+15551234567"],
    metadata={"detected_intent": "calendar_query"}
)

Update User Properties

from app.analytics.amplitude_service import get_amplitude_service

amplitude = get_amplitude_service()

# Set user properties
amplitude.identify_user(
    user_id="user-uuid",
    properties={
        "plan_tier": "friend",
        "superpowers_enabled_count": 3,
        "relationship_stage": "best_friend"
    }
)

# Increment counter
amplitude.increment_user_property(
    user_id="user-uuid",
    property_name="total_messages_sent",
    delta=1
)

Track LLM Usage

from app.utils.llm_client import calculate_llm_cost
from app.analytics.tracking_helper import get_tracking_helper

tracker = get_tracking_helper()

# After LLM call
tracker.track_llm_usage(
    user_id="user-uuid",
    persona_id="sage",
    channel="telegram",
    model="gpt-4o",
    prompt_type="chat_reply",
    prompt_tokens=1200,
    completion_tokens=150,
    total_tokens=1350,
    cost_usd=calculate_llm_cost("gpt-4o", 1200, 150),
    latency_ms=1847,
    status="success",
    conversation_history_messages=10,
    memories_included=3
)

Privacy & Compliance

✅ What We Track

  • Message counts and lengths
  • Event timestamps
  • Feature usage flags
  • Aggregate metrics (# of calendar events)
  • Performance data (latency, tokens)
  • Error types

❌ What We Never Track

  • Message text content
  • Email subjects or bodies
  • Calendar event titles
  • Names (except persona names)
  • Specific locations
  • OAuth tokens
  • Raw phone numbers (hashed for correlation only)

Data Protection

  • All sensitive identifiers (chat_guid, phone) are SHA-256 hashed
  • Properties sanitized automatically (UUIDs → strings, truncate long values)
  • Analytics failures never crash the application
  • Can be disabled by not setting AMPLITUDE_API_KEY
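
To make the hashing and sanitization rules concrete, here is a minimal sketch of what such helpers might look like. The function names and the 256-character limit are assumptions for illustration; the actual limits and helpers live in app/analytics/amplitude_service.py and may differ.

```python
import hashlib
import uuid
from typing import Any, Dict

MAX_VALUE_LENGTH = 256  # Hypothetical truncation limit.


def hash_identifier(value: str) -> str:
    """Return the SHA-256 hex digest of a sensitive identifier."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def sanitize_properties(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Convert UUIDs to strings and truncate long string values."""
    clean: Dict[str, Any] = {}
    for key, value in properties.items():
        if isinstance(value, uuid.UUID):
            value = str(value)
        if isinstance(value, str) and len(value) > MAX_VALUE_LENGTH:
            value = value[:MAX_VALUE_LENGTH]
        clean[key] = value
    return clean
```

Because phone numbers come from a small, enumerable space, a plain hash can in principle be reversed by brute force. A keyed hash (for example HMAC with a server-side secret) may be worth considering if that matters for your threat model.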

Debugging

Check if Amplitude is Enabled

from app.analytics.amplitude_service import get_amplitude_service

amplitude = get_amplitude_service()
print(f"Amplitude enabled: {amplitude.enabled}")
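
For context on what the enabled flag represents, the sketch below shows one way a wrapper could implement no-op mode and swallow tracking errors. `NoOpSafeTracker` and the `client.send(...)` call are hypothetical stand-ins, not the real service class or the Amplitude SDK's API.

```python
import logging
import os
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)


class NoOpSafeTracker:
    """Hypothetical wrapper: disabled without an API key, never raises."""

    def __init__(self, api_key: Optional[str] = None, client: Any = None):
        self.api_key = api_key or os.getenv("AMPLITUDE_API_KEY")
        self.client = client  # A real SDK client would be created here.
        self.enabled = bool(self.api_key) and client is not None

    def track_event(self, event_name: str, user_id: str,
                    properties: Optional[Dict[str, Any]] = None) -> bool:
        """Send one event; return False when disabled or when sending fails."""
        if not self.enabled:
            return False  # No-op mode.
        try:
            # `send` is a placeholder method on the stand-in client.
            self.client.send(event_name, user_id, properties or {})
            return True
        except Exception as exc:
            logger.warning("Amplitude tracking error: %s", exc)
            return False
```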

View Logs

Analytics operations are logged:

# Look for log messages like:
# "Amplitude analytics initialized successfully"
# "Tracked event message_sent_user for user abc123"
# "Identified user abc123 with properties: {...}"
# "Amplitude tracking error: ..." (if errors occur)

Test Event Delivery

from app.analytics.amplitude_service import get_amplitude_service

amplitude = get_amplitude_service()

# Send test event
amplitude.track_event(
    event_name="test_event",
    user_id="test-user-123",
    properties={"test": True}
)

# Flush immediately (useful for testing)
amplitude.flush()

Architecture

Component Hierarchy

┌─────────────────────────────────────────────┐
│  Application Code (main.py, routes, etc.)  │
└────────────────┬────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────┐
│    TrackingHelper (tracking_helper.py)      │
│  - High-level convenience methods           │
│  - User context management                  │
│  - Automatic property enrichment            │
└────────────────┬────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────┐
│    AmplitudeService (amplitude_service.py)  │
│  - 40+ event tracking methods               │
│  - Property sanitization                    │
│  - Error handling                           │
└────────────────┬────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────┐
│      Amplitude Python SDK (amplitude)       │
│  - HTTP client                              │
│  - Event batching                           │
│  - Network retries                          │
└─────────────────────────────────────────────┘

Key Files

| File | Purpose | Lines |
|------|---------|-------|
| app/analytics/amplitude_service.py | Core Amplitude wrapper | 800+ |
| app/analytics/tracking_helper.py | High-level tracking utilities | 350+ |
| app/main.py | iMessage/Telegram tracking | Modified |
| app/oauth/routes.py | OAuth event tracking | Modified |
| app/utils/llm_client.py | Cost calculation helpers | Modified |
| AMPLITUDE_TRACKING_SHEET.md | Complete event specification | 700+ |

Amplitude Dashboard Setup

1. Daily Active Users (DAU)

Event: message_sent_user
Unique users by: user_id
Group by: Date

2. Acquisition Funnel

1. user_discovered
2. persona_selected
3. First message_sent_user
4. First integration_connected

3. Feature Adoption

Event: integration_connected
Group by: integration_type
Chart type: Stacked area

4. LLM Cost per User

Event: llm_request_completed
Formula: SUM(llm_cost_usd) / COUNT(DISTINCT user_id)
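
As a worked example of this formula outside the dashboard, the sketch below computes the same ratio over a list of exported events. The event dictionaries are made-up sample data, and `llm_cost_per_user` is a hypothetical helper name.

```python
from typing import Dict, Iterable


def llm_cost_per_user(events: Iterable[Dict]) -> float:
    """SUM(llm_cost_usd) / COUNT(DISTINCT user_id) over completed requests."""
    total_cost = 0.0
    users = set()
    for event in events:
        total_cost += event["llm_cost_usd"]
        users.add(event["user_id"])
    return total_cost / len(users) if users else 0.0


# Made-up sample: user u1 made two requests, user u2 made one.
sample = [
    {"user_id": "u1", "llm_cost_usd": 0.25},
    {"user_id": "u1", "llm_cost_usd": 0.25},
    {"user_id": "u2", "llm_cost_usd": 0.5},
]
print(llm_cost_per_user(sample))  # 1.0 total across 2 users -> 0.5
```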

5. Messaging Volume

Events:
  - message_sent_user (User → Sage)
  - message_sent_companion (Sage → User)
Group by: channel

User Cohorts

Create cohorts for:

  • Power Users: total_messages_sent > 100
  • OAuth Users: calendar_connected = true OR gmail_connected = true
  • Best Friends: relationship_stage = "best_friend"
  • Telegram Users: primary_channel = "telegram"
  • iMessage Users: primary_channel = "imessage"


Next Steps: Completing the Implementation

Phase 2: High-Value Events (1-2 days)

  1. LLM Cost Tracking
    - Wire up llm_request_completed in two_stage_handler.py
    - Add to all LLM call sites
    - Impact: Understand per-user costs, optimize expensive queries

  2. Conversation Sessions
    - Implement conversation_started / conversation_ended
    - Track session length and depth
    - Impact: Measure engagement quality, not just volume

  3. Superpower Triggers
    - Track superpower_triggered in workflow engine
    - Track superpower_result_sent on completion
    - Impact: Feature adoption metrics, workflow effectiveness
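
Session detection for conversation_started / conversation_ended is commonly based on an inactivity gap. The sketch below assumes a 30-minute threshold and an in-memory store. Both are illustrative choices, and `SessionDetector` is a hypothetical name rather than existing project code.

```python
from datetime import datetime, timedelta
from typing import Dict, List

SESSION_GAP = timedelta(minutes=30)  # Hypothetical inactivity threshold.


class SessionDetector:
    """Emit conversation_started / conversation_ended from message timestamps."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, datetime] = {}

    def on_message(self, user_id: str, now: datetime) -> List[str]:
        """Return the session events implied by a message arriving at `now`."""
        events: List[str] = []
        last = self._last_seen.get(user_id)
        if last is None:
            events.append("conversation_started")
        elif now - last > SESSION_GAP:
            # The previous session ended at `last`; a new one begins now.
            events.extend(["conversation_ended", "conversation_started"])
        self._last_seen[user_id] = now
        return events
```

In this form conversation_ended is only emitted lazily, when the user's next message arrives. A periodic sweep over stale entries would be needed to close sessions for users who never return.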

Phase 3: Advanced Analytics (2-3 days)

  1. Memory System
    - Track memory_created, memory_retrieved
    - Measure memory utilization
    - Impact: Understand memory system value, optimize retrieval

  2. Relationship Progression
    - Track relationship_stage_changed
    - Correlate with retention
    - Impact: Validate relationship model, improve progression

  3. Superpower Acceptance
    - Track superpower_accepted / superpower_rejected
    - Measure user satisfaction with suggestions
    - Impact: Improve suggestion quality, reduce noise

Phase 4: Reliability & Ops (Ongoing)

  1. Error Tracking
    - Wire up message_delivery_failed, superpower_error
    - Create real-time alerts in Amplitude
    - Impact: Faster incident detection and resolution

Troubleshooting

Events Not Appearing in Amplitude

  1. Check API Key

    echo $AMPLITUDE_API_KEY
    # Should print your API key
    

  2. Check Logs

    # Look for initialization message
    grep "Amplitude" logs/app.log
    
    # Look for tracking errors
    grep "Amplitude tracking error" logs/app.log
    

  3. Verify Initialization

    from app.analytics.amplitude_service import get_amplitude_service
    amplitude = get_amplitude_service()
    print(f"Enabled: {amplitude.enabled}")
    print(f"Client: {amplitude.client}")
    

  4. Test Manually

    # In Python shell
    from app.analytics.amplitude_service import get_amplitude_service
    amplitude = get_amplitude_service()
    
    amplitude.track_event(
        event_name="manual_test",
        user_id="test-123",
        properties={"test": True}
    )
    
    amplitude.flush()  # Force immediate send
    

  5. Check Amplitude Live Stream
    - Events may take 1-2 minutes to appear
    - Use "User Lookup" to search for specific user_id
    - Verify event properties are populated correctly

High Event Volume / Costs

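Amplitude plans are priced primarily on Monthly Tracked Users (MTUs) rather than raw event volume, though a plan can also carry an event quota. At early-stage volumes the current implementation should stay within the free tier (cited here as 10M events/month; confirm the current limit on Amplitude's pricing page).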

If costs become an issue:

  1. Sample Events: Track only 10% of message_sent_user
  2. Batch Updates: User properties updated daily instead of real-time
  3. Reduce Properties: Remove low-value properties from events
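
If event sampling is ever needed, a deterministic hash-based decision is one option: the same event key always gets the same keep/drop answer, which makes retries idempotent. The sketch below assumes each message has some stable identifier to use as the key. `should_sample` is a hypothetical helper, not part of the current codebase.

```python
import hashlib


def should_sample(event_key: str, rate: float = 0.10) -> bool:
    """Deterministically keep roughly `rate` of events, keyed by `event_key`."""
    digest = hashlib.sha256(event_key.encode("utf-8")).digest()
    # Map the first 4 bytes to a float in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < rate
```

Counts derived from sampled events need to be scaled by 1 / rate in dashboards, and unique-user metrics such as DAU will be understated if message_sent_user is the only event a user emits.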

Performance Impact

Analytics tracking is designed to be minimal:

  • All tracking is async/non-blocking
  • Errors are caught and logged (never crash app)
  • SDK batches events automatically
  • Average overhead: < 5ms per event

If performance issues occur:

  1. Disable Amplitude temporarily: Remove AMPLITUDE_API_KEY env var
  2. Check for tracking code in hot paths (< 1ms critical sections)
  3. Move tracking to background tasks for heavy operations
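
Moving tracking off the request path can be sketched with a bounded queue and a worker thread. This is an illustrative stdlib-only pattern (`BackgroundTracker` is a hypothetical name). In a FastAPI app, the framework's own background task facilities might be a more natural fit.

```python
import logging
import queue
import threading
from typing import Callable

logger = logging.getLogger(__name__)


class BackgroundTracker:
    """Run tracking callables on a worker thread so callers never block."""

    def __init__(self, max_pending: int = 1000) -> None:
        self._queue: "queue.Queue[Callable[[], None]]" = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, task: Callable[[], None]) -> bool:
        """Enqueue a tracking call; drop it (return False) if the queue is full."""
        try:
            self._queue.put_nowait(task)
            return True
        except queue.Full:
            return False

    def _run(self) -> None:
        while True:
            task = self._queue.get()
            try:
                task()
            except Exception as exc:
                # Failures are logged and swallowed so the worker keeps running.
                logger.warning("Amplitude tracking error: %s", exc)
            finally:
                self._queue.task_done()

    def drain(self) -> None:
        """Block until all queued tasks have been processed (useful in tests)."""
        self._queue.join()
```

Dropping events when the queue is full trades completeness for a hard bound on memory and latency, which matches the guide's stance that analytics should never degrade the application.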


Support & Resources

Documentation

Code References

  • Event tracking methods: app/analytics/amplitude_service.py
  • Helper utilities: app/analytics/tracking_helper.py
  • Integration examples: app/main.py, app/oauth/routes.py

Questions?

Check existing implementations for patterns, or refer to the tracking sheet for event specifications.


Changelog

v1.0 (2025-11-06)

  • ✅ Core analytics infrastructure
  • ✅ Acquisition & onboarding events
  • ✅ Basic messaging events (inbound only)
  • ✅ OAuth integration events
  • ✅ User properties (core set)
  • ✅ Privacy protection and sanitization
  • ✅ Error handling and graceful degradation

Future Releases

  • v1.1: LLM cost tracking, conversation sessions
  • v1.2: Superpower events, memory tracking
  • v1.3: Relationship progression, advanced analytics
  • v1.4: Reliability events, error tracking
  • v2.0: Billing events (when Stripe launches)

Last Updated: November 6, 2025
Status: Core implementation complete, ready for production testing ✅