ADR 005: Intent-Based MiniApp Routing

Date: 2025-12-02
Status: Implemented
Deciders: Engineering Team
Related: ADR 004: Mini-App Framework Architecture


Context

We observed a critical bug where users' questions were being incorrectly handled by MiniApp handlers:

Example:

  • User has active Shanghai trip planning session
  • User asks: "Sage what are the best Hyatt hotels in Asia?"
  • Expected: GPT-5 answers the question about hotels
  • Actual: Trip planner added "Sage what are the best Hyatt hotels in Asia?" as a venue

Root Cause Analysis:

The old routing system asked the wrong question:

  • Old Question: "Is this message relevant to the active MiniApp?"
  • Problem: A question about hotels IS topically relevant to a trip, so the classifier returned relevant=True

This is fundamentally incorrect because relevance ≠ actionability.


Decision

Replace the relevance-based classifier with an Intent-Based Router that asks:

"Does the user want to PERFORM AN ACTION or ASK A QUESTION?"

Intent Classification Taxonomy

| Intent Type    | Description                                             | Routing                       |
|----------------|---------------------------------------------------------|-------------------------------|
| CONVERSATIONAL | Questions, opinions, information seeking, general chat  | → GPT-5 (always)              |
| TRANSACTIONAL  | Add, remove, modify, record actions                     | → MiniApp (if session active) |
| NAVIGATIONAL   | View, list, check status                                | → MiniApp (if session active) |
| META           | Session control (cancel, done, help)                    | → Context-dependent           |

Key Insight

CONVERSATIONAL intent ALWAYS routes to GPT-5, even if the topic is related to an active MiniApp.

This is the critical distinction the old system missed.
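The taxonomy and the key insight can be read as a small lookup. The following is a minimal sketch, not the shipped code: the `Intent` enum, the `route_for` helper, and the `"gpt5"` / `"miniapp"` target labels are names assumed here for illustration. META is treated like the other action intents when a session is active, which is a simplification of "Context-dependent".

```python
from enum import Enum
from typing import Optional


class Intent(Enum):
    """The four intent types from the taxonomy (enum name is an assumption)."""
    CONVERSATIONAL = "conversational"
    TRANSACTIONAL = "transactional"
    NAVIGATIONAL = "navigational"
    META = "meta"


def route_for(intent: Intent, active_miniapp_id: Optional[str]) -> str:
    """Return "gpt5" or "miniapp" for an already-classified intent."""
    # Key insight: CONVERSATIONAL goes to GPT-5 even when a session is active.
    if intent is Intent.CONVERSATIONAL:
        return "gpt5"
    # Action intents reach the MiniApp only if a session is active.
    # Assumption: META follows the same rule in this sketch.
    if active_miniapp_id is not None:
        return "miniapp"
    return "gpt5"
```

With an active `trip_planner` session, `route_for(Intent.CONVERSATIONAL, "trip_planner")` yields `"gpt5"`, while `route_for(Intent.TRANSACTIONAL, "trip_planner")` yields `"miniapp"`.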


Implementation

New Component: IntentRouter

File: app/orchestrator/intent_router.py

from __future__ import annotations  # lets the RoutingDecision annotation stay unevaluated

from typing import Any, Dict, List, Optional


class IntentRouter:
    """
    Unified intent router for message classification.

    Uses gpt-5-nano for fast, accurate intent classification with
    explicit understanding of the difference between:
    - Seeking information (CONVERSATIONAL → GPT-5)
    - Performing actions (TRANSACTIONAL/NAVIGATIONAL → MiniApp)
    """

    async def classify_and_route(
        self,
        message: str,
        active_miniapp_id: Optional[str],
        session_data: Optional[Dict[str, Any]],
        conversation_history: Optional[List[Dict[str, str]]] = None,
    ) -> RoutingDecision:
        # RoutingDecision is the router's result type; its definition and the
        # method body are not shown in this ADR.
        ...

Routing Decision Flow

Message Received
         │
         ▼
┌──────────────────┐
│ New MiniApp      │ YES ──► Create session, route to MiniApp
│ Trigger Detected?│
└────────┬─────────┘
         │ NO
         ▼
┌──────────────────┐
│ Active MiniApp   │ NO ──► Route to GPT-5
│ Session?         │
└────────┬─────────┘
         │ YES
         ▼
┌──────────────────┐
│ IntentRouter     │
│ Classification   │
└────────┬─────────┘
    ┌────┴───────────┐
    ▼                ▼
CONVERSATIONAL   TRANSACTIONAL/
    │            NAVIGATIONAL/META
    ▼                ▼
  GPT-5           MiniApp
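The order of checks in the flow can be sketched as a single function. This is an illustration under stated assumptions: `decide_route`, `detect_trigger`, `classify_intent`, and the string targets are hypothetical names, and the two stand-in callables exist only to make the example runnable. The real router is asynchronous and returns a `RoutingDecision`.

```python
from typing import Callable, Optional


def decide_route(
    message: str,
    active_miniapp_id: Optional[str],
    detect_trigger: Callable[[str], Optional[str]],
    classify_intent: Callable[[str], str],
) -> str:
    """Walk the three checks in order and return a routing target."""
    # 1. A new MiniApp trigger wins: create a session and route to that MiniApp.
    triggered_app = detect_trigger(message)
    if triggered_app is not None:
        return f"miniapp:{triggered_app}"

    # 2. Without an active session there is nothing to classify against.
    if active_miniapp_id is None:
        return "gpt5"

    # 3. Only now is the intent classifier consulted.
    intent = classify_intent(message)
    if intent == "CONVERSATIONAL":
        return "gpt5"
    return f"miniapp:{active_miniapp_id}"


# Stand-in callables for illustration only (not the real trigger detector or LLM).
def no_trigger(message: str) -> Optional[str]:
    return None


def question_mark_classifier(message: str) -> str:
    return "CONVERSATIONAL" if message.rstrip().endswith("?") else "TRANSACTIONAL"
```

Passing the trigger detector and classifier in as callables keeps the ordering logic testable without an LLM call.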

Fast Path Optimizations

The IntentRouter includes fast paths that skip the LLM call (~100ms savings):

  1. Direct assistant address without action keywords:
       • "Sage what are the best hotels?" → CONVERSATIONAL (no LLM needed)
       • "Sage add Bar Mood" → LLM classifies (has "add" keyword)
  2. Obvious conversational patterns:
       • "What's the weather", "How are you", "Tell me about..." → CONVERSATIONAL
  3. Session control keywords:
       • "Cancel", "Stop", "Done" → META → MiniApp
  4. No active session:
       • Any message without active MiniApp → GPT-5 (let trigger detection handle new sessions)
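The four fast paths could look roughly like the sketch below. The keyword sets, the `fast_path` name, and the convention of returning `None` to mean "fall through to the LLM" are assumptions for illustration; only "add", "Cancel", "Stop", "Done", and the three conversational phrases come from this ADR. The no-session case is labelled CONVERSATIONAL here simply because that intent routes to GPT-5.

```python
import re
from typing import Optional

ASSISTANT_NAME = "sage"
# Hypothetical keyword list; the ADR only names "add" explicitly.
ACTION_KEYWORDS = {"add", "remove", "delete", "show", "list", "save"}
META_KEYWORDS = {"cancel", "stop", "done"}
CONVERSATIONAL_PREFIXES = ("what's the weather", "how are you", "tell me about")


def fast_path(message: str, active_miniapp_id: Optional[str]) -> Optional[str]:
    """Return an intent label if a fast path applies, else None (call the LLM)."""
    text = message.strip().lower()
    words = re.findall(r"[a-z']+", text)

    # No active session: route to GPT-5 and let trigger detection start sessions.
    if active_miniapp_id is None:
        return "CONVERSATIONAL"
    # Session control keywords sent on their own.
    if text.rstrip(".!") in META_KEYWORDS:
        return "META"
    # Obvious conversational openers.
    if text.startswith(CONVERSATIONAL_PREFIXES):
        return "CONVERSATIONAL"
    # Direct assistant address with no action keyword after the name.
    if words and words[0] == ASSISTANT_NAME:
        if not ACTION_KEYWORDS.intersection(words[1:]):
            return "CONVERSATIONAL"
    # Anything else is ambiguous enough to need the LLM.
    return None
```

Under these assumptions, "Sage add Bar Mood" returns `None` because "add" follows the assistant name, so the LLM still decides that case.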

LLM Prompt Design

The system prompt explicitly teaches the distinction:

The key question is NOT "is this message about {app_name}?"
The key question IS "does the user want to PERFORM AN ACTION or are they ASKING A QUESTION?"

### CONVERSATIONAL Intent (→ General Assistant)
User wants information, opinions, advice, or general chat.
Even if the topic relates to {app_name}, informational questions go to the general assistant.

Examples of CONVERSATIONAL (even during active trip planning):
- "What are the best hotels in Shanghai?" (seeking information)
- "Is the Bund worth visiting?" (seeking opinion)
- "Tell me about Shanghai's food scene" (seeking knowledge)
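A sketch of how the `{app_name}` placeholder might be filled and how a model reply could be parsed with the fail-safe default to CONVERSATIONAL. The helper names and the final "Reply with exactly one word" instruction are assumptions, not taken from the actual prompt, and no LLM client is called here.

```python
# Opening lines quoted from the prompt above; the last line is an assumed
# output-format instruction added for this sketch.
PROMPT_TEMPLATE = (
    'The key question is NOT "is this message about {app_name}?"\n'
    'The key question IS "does the user want to PERFORM AN ACTION '
    'or are they ASKING A QUESTION?"\n'
    "Reply with exactly one word: CONVERSATIONAL, TRANSACTIONAL, "
    "NAVIGATIONAL, or META."
)

VALID_INTENTS = {"CONVERSATIONAL", "TRANSACTIONAL", "NAVIGATIONAL", "META"}


def build_system_prompt(app_name: str) -> str:
    """Fill the {app_name} placeholder for the active MiniApp."""
    return PROMPT_TEMPLATE.format(app_name=app_name)


def parse_intent(raw_reply: str) -> str:
    """Map the model's reply to an intent label, failing safe to CONVERSATIONAL."""
    label = raw_reply.strip().upper()
    # Unknown or malformed output never reaches a MiniApp handler.
    return label if label in VALID_INTENTS else "CONVERSATIONAL"
```

Defaulting to CONVERSATIONAL means a bad classification costs at most a general answer from GPT-5, rather than an unwanted write to the session such as the venue bug described in Context.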

Alternatives Considered

Option 1: Expand Escape Patterns (Quick Fix)

Approach: Add more keyword patterns to escape MiniApp routing

escape_patterns = [
    "sage ", "hey sage", "@sage",
    "what are the best", "where should i",
    ...
]

Pros:

  • Fast to implement
  • No LLM cost

Cons:

  • Brittle - endless pattern expansion
  • False positives/negatives
  • Doesn't solve the fundamental problem

Verdict: ❌ Rejected - band-aid, not solution

Option 2: Confidence Threshold Adjustment

Approach: Only route to MiniApp if classifier confidence > 80%

Pros:

  • Simple change

Cons:

  • Still using wrong question (relevance)
  • Doesn't distinguish questions from actions
  • Arbitrary threshold

Verdict: ❌ Rejected - wrong abstraction

Option 3: Two-Stage Classification

Approach: First classify intent type, then classify relevance

Pros:

  • Most accurate
  • Clear separation of concerns

Cons:

  • Double LLM latency (~200ms)
  • Increased cost

Verdict: ⚠️ Considered but simplified - combined into single prompt

Option 4: Intent-Based Router (CHOSEN)

Approach: Single LLM call with intent-focused prompt

Pros:

  • ✅ Asks the right question (intent, not relevance)
  • ✅ Single LLM call (~100ms)
  • ✅ Fast paths for obvious cases (0ms)
  • ✅ Extensible to new intent types
  • ✅ Fail-safe (defaults to CONVERSATIONAL)

Cons:

  • ⚠️ Requires good prompt engineering
  • ⚠️ Still has LLM latency for ambiguous cases

Verdict: ✅ Chosen - best balance of accuracy and performance


Examples: Before vs After

| Message                                   | Active App   | Old Behavior     | New Behavior                       |
|-------------------------------------------|--------------|------------------|------------------------------------|
| "What are the best Hyatt hotels in Asia?" | trip_planner | ❌ Added as venue | ✅ GPT-5 answers                    |
| "Sage add Bar Mood"                       | trip_planner | ✅ Added          | ✅ Added (TRANSACTIONAL)            |
| "Show my venues"                          | trip_planner | ✅ Showed list    | ✅ Showed list (NAVIGATIONAL)       |
| "Should I visit the Bund?"                | trip_planner | ❌ Might add      | ✅ GPT-5 gives opinion              |
| "Is dim sum good there?"                  | trip_planner | ❌ Might add      | ✅ GPT-5 answers                    |
| "Cancel"                                  | trip_planner | ✅ Ended session  | ✅ Ended session (META)             |
| "Sage help me plan"                       | None         | ✅ Started trip   | ✅ Started trip (trigger detection) |

Consequences

Positive

  1. Correct Routing: Questions now go to GPT-5, actions go to MiniApp
  2. Better UX: Users can ask questions mid-session without confusion
  3. Principled Design: Intent-first is the right abstraction
  4. Extensible: Easy to add new intent types (CLARIFICATION, FEEDBACK, etc.)
  5. Fail-Safe: Defaults to CONVERSATIONAL when uncertain

Negative

  1. LLM Dependency: Still requires LLM for ambiguous cases (~100ms)
  2. Prompt Sensitivity: Quality depends on prompt engineering
  3. New Complexity: Another classifier to maintain

Neutral

  1. Performance: Similar latency to old system (fast paths + LLM fallback)
  2. Cost: Same LLM calls, just better prompts

Files Changed

New Files

  • app/orchestrator/intent_router.py - IntentRouter class with classification logic

Modified Files

  • app/orchestrator/miniapp_router.py - Now uses IntentRouter instead of relevance_classifier

Deprecated (Not Removed)

  • app/miniapps/routing/relevance_classifier.py - Old classifier, kept for reference

Validation

Test Cases

  1. General question with active trip:
       • Input: "What are the best hotels in Shanghai?"
       • Expected: CONVERSATIONAL → GPT-5
       • Result: ✅ Passed

  2. Action command with active trip:
       • Input: "Add Park Hyatt to my list"
       • Expected: TRANSACTIONAL → MiniApp
       • Result: ✅ Passed

  3. View command with active trip:
       • Input: "Show me my venues"
       • Expected: NAVIGATIONAL → MiniApp
       • Result: ✅ Passed

  4. Session control:
       • Input: "Cancel"
       • Expected: META → MiniApp (end session)
       • Result: ✅ Passed

  5. Opinion question:
       • Input: "Should I visit the Bund?"
       • Expected: CONVERSATIONAL → GPT-5
       • Result: ✅ Passed
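These cases lend themselves to a table-driven regression check. The sketch below assumes a synchronous `classify` callable that wraps the router and returns an intent label; `CASES` and `failing_cases` are hypothetical names, and the real tests would need to await `classify_and_route` with an active trip session.

```python
from typing import Callable, List, Tuple

# (input message, expected intent) pairs taken from the test cases above.
CASES: List[Tuple[str, str]] = [
    ("What are the best hotels in Shanghai?", "CONVERSATIONAL"),
    ("Add Park Hyatt to my list", "TRANSACTIONAL"),
    ("Show me my venues", "NAVIGATIONAL"),
    ("Cancel", "META"),
    ("Should I visit the Bund?", "CONVERSATIONAL"),
]


def failing_cases(classify: Callable[[str], str]) -> List[Tuple[str, str, str]]:
    """Return (message, expected, actual) for every case the classifier gets wrong."""
    failures = []
    for message, expected in CASES:
        actual = classify(message)
        if actual != expected:
            failures.append((message, expected, actual))
    return failures
```

An empty list from `failing_cases` means every case routed as expected; new regressions (such as the original Hyatt question) can be appended to `CASES` as they are found.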

Revision History

  • 2025-12-02: Initial implementation after bug discovery
       • Fixed "Sage what are the best Hyatt hotels in Asia?" being added as venue