LLM-Based Entity Extraction: Architecture Refactoring Proposal¶

Author: Claude Code Date: December 6, 2025 Status: ✅ Implemented (Phases 1-4 Complete) Priority: High

Implementation Status¶

Phase	Description	Status
Phase 1	Add Entity Extractor Service	✅ Complete
Phase 2	Wire into SmartMessageRouter	✅ Complete
Phase 3	Refactor TripPlannerHandler	✅ Complete
Phase 4	Remove Redundant Patterns	✅ Complete
Phase 5	Extend to Other MiniApps	🔜 Future

Key Files Created: - app/orchestrator/entities.py - Entity dataclasses - app/orchestrator/entity_extractor.py - LLM extraction service

Key Files Modified: - app/orchestrator/smart_message_router.py - Calls EntityExtractor - app/miniapps/handlers/trip_planner.py - Uses extracted entities - app/miniapps/routing/command_parser.py - Deprecated (fallback only)

Executive Summary¶

Replace fragmented regex/keyword matching across 4+ files (~70+ patterns) with a unified LLM-based entity extraction layer using gpt-5-nano. This will:

Reduce maintenance burden - One prompt template vs 70+ regex patterns
Improve accuracy - LLM understands context, synonyms, and variations
Handle edge cases naturally - No more "add a pattern for this new phrasing"
Enable split-thought entry - LLM understands "It's in Shinjuku" refers to last venue
Future-proof the system - New entity types don't require code changes

Part 1: Current State Analysis¶

Files with Regex/Keyword Matching¶

File	Pattern Count	Purpose
`intent_router.py`	~15 patterns	Routing decisions
`trip_planner.py`	~35 patterns	Command matching, metadata extraction
`venue_resolver.py`	~12 patterns	Venue name/metadata parsing
`command_parser.py`	Already LLM	Command classification (blueprint)

Pattern Categories¶

1. Command Detection (IntentRouter + TripPlanner)

# These patterns decide what action to take
"delete that", "remove it", "take off"  → delete_venue
"rate it 5 stars", "5/5"                → rate_venue
"mark that as visited"                  → mark_visited
"remind me what we planned"             → recall_trip

2. Entity Extraction (VenueResolver + TripPlanner)

# These patterns extract structured data
"in Shinjuku"                           → district: "Shinjuku"
"Seongsu district"                      → district: "Seongsu"
"try the xiaolongbao"                   → must_try: ["xiaolongbao"]
"great for dinner"                      → best_for: ["dinner"]

3. Metadata Updates (TripPlanner)

# Follow-up messages about the last venue
"It's in Shinjuku"                      → Update last venue district
"Supposed to have amazing cocktails"    → Update last venue must_try

4. Query Filters (TripPlanner + QueryEngine)

"breakfast spots?"                      → best_for filter: "morning"
"what's in Shibuya?"                    → district filter: "Shibuya"
"unvisited places"                      → status filter: "unvisited"

Problems with Current Approach¶

Fragile - Each new phrasing requires new regex
Duplicated - Same patterns in IntentRouter AND TripPlanner
No Context - Regex can't understand "that" refers to last venue
Hard to Debug - Which of 70 patterns matched?
Expensive to Maintain - ~5 files need updating for new features

Part 2: Proposed Architecture¶

Core Concept: Structured Entity Extraction¶

Instead of regex, use gpt-5-nano to extract entities into typed dataclasses:

@dataclass
class VenueEntity:
    """Extracted from user message."""
    name: Optional[str] = None           # "Bar Benfiddich"
    district: Optional[str] = None       # "Shinjuku"
    category: Optional[str] = None       # "bar", "restaurant", "cafe"
    must_try: List[str] = field(default_factory=list)  # ["herbal cocktails"]
    vibe: List[str] = field(default_factory=list)      # ["speakeasy", "intimate"]
    best_for: List[str] = field(default_factory=list)  # ["dinner", "date night"]
    rating: Optional[int] = None         # 1-5
    source_note: Optional[str] = None    # "from TikTok"

    # Action context
    action: str = "add"                  # add, update, delete, rate, mark_visited
    references_last: bool = False        # True if "that", "it", "this" used

@dataclass
class QueryEntity:
    """Extracted from query messages."""
    query_type: str = "list_all"         # list_all, filter, search
    filters: Dict[str, Any] = field(default_factory=dict)
    # filters: {"district": "Shinjuku", "best_for": "dinner", "status": "unvisited"}

@dataclass
class TripEntity:
    """Extracted for trip-level operations."""
    destination: Optional[str] = None
    action: str = "set_destination"      # set_destination, recall, end_session

The Entity Extractor Service¶

class EntityExtractor:
    """
    LLM-based entity extraction using gpt-5-nano.

    Single entry point for all entity extraction, replacing:
    - VenueResolver.parse_user_input()
    - TripPlannerHandler._looks_like_metadata_update()
    - TripPlannerHandler._extract_* methods
    - IntentRouter fast-path patterns
    """

    def __init__(self):
        self.llm = LLMClient(model="gpt-5-nano")

    async def extract(
        self,
        message: str,
        context: EntityExtractionContext,
    ) -> ExtractedEntities:
        """
        Extract structured entities from a message.

        Args:
            message: User's message
            context: Session context (last venue, destination, etc.)

        Returns:
            ExtractedEntities with venues, queries, and trip operations
        """
        prompt = self._build_prompt(message, context)
        response = self.llm.generate_response(
            system_prompt=ENTITY_EXTRACTION_SYSTEM_PROMPT,
            user_message=prompt,
            max_tokens=300,
        )
        return self._parse_response(response)

System Prompt Template¶

ENTITY_EXTRACTION_SYSTEM_PROMPT = """You are an entity extractor for a trip planning assistant.

Given a user message and context, extract structured entities.

## Context You'll Receive
- Current destination (e.g., "Tokyo")
- Last added venue (for pronoun resolution)
- Session state

## Entity Types to Extract

### VenueEntity (when adding/updating venues)
- name: Venue name (or null if updating last venue via pronoun)
- district: Neighborhood/area
- category: restaurant, bar, cafe, shop, landmark, etc.
- must_try: Foods/drinks to try
- vibe: Descriptors (trendy, cozy, romantic, etc.)
- best_for: Time/occasion (morning, dinner, photos, etc.)
- rating: 1-5 stars if mentioned
- action: "add", "update", "delete", "rate", "mark_visited"
- references_last: true if message uses "that", "it", "this", "there"

### QueryEntity (when asking about venues)
- query_type: "list_all", "filter", "search"
- filters: {district, category, best_for, status, vibe}

### TripEntity (for trip-level operations)
- destination: City/country
- action: "set_destination", "recall", "end_session"

## Important Rules

1. **Pronoun Resolution**: "It's in Shinjuku" → references_last=true, district="Shinjuku"
2. **Split Thoughts**: "Supposed to have great cocktails" → update action, references_last=true
3. **Bulk Import**: Multiple venues → return array of VenueEntity
4. **Category Inference**: "coffee shop" → category="cafe"
5. **Time Inference**: "dinner spot" → best_for=["dinner"]

## Response Format
```json
{
  "entities": {
    "venues": [...],  // Array of VenueEntity
    "query": {...},   // QueryEntity if asking about venues
    "trip": {...}     // TripEntity if trip-level operation
  },
  "primary_action": "add_venue" | "update_venue" | "query" | "set_destination" | ...,
  "confidence": 0.0-1.0
}
```"""

Part 3: Integration Points¶

A. Replace IntentRouter Fast-Paths¶

Before:

# intent_router.py - 50+ lines of patterns
recall_patterns = ["remind me what we", "what did we add", ...]
if any(p in message_lower for p in recall_patterns):
    return RoutingDecision(...)

district_patterns = [r"(?:what'?s|spots|places)\s+(?:in|near)\s+\w+", ...]
if any(re.search(p, message_lower) for p in district_patterns):
    return RoutingDecision(...)

After:

# intent_router.py - 5 lines
entities = await entity_extractor.extract(message, context)
if entities.query:
    return RoutingDecision(
        target=RoutingTarget.MINIAPP,
        detected_action=entities.primary_action,
        entities=entities.to_dict(),
    )

B. Replace TripPlannerHandler Parsing¶

Before:

# trip_planner.py - 200+ lines of regex
if message_lower.startswith("trip to "):
    destination = self._extract_destination(message[8:])

if trip.last_interacted_venue_id and self._looks_like_metadata_update(message):
    response = self._append_metadata_to_last(...)

visited_match = re.match(r"^(visited|went to|checked out)\s+(.+)", message_lower)
rate_match = re.match(r"^(\d)\s*(?:star|stars|/5|\*)(?:\s+(?:for|to)\s+(.+))?", ...)

After:

# trip_planner.py - 20 lines
entities = context.routing_entities  # Pre-extracted by SmartMessageRouter

if entities.venues:
    for venue in entities.venues:
        if venue.references_last:
            self._update_last_venue(trip, venue)
        elif venue.action == "add":
            await self._add_venue_from_entity(trip, venue)
        elif venue.action == "delete":
            self._delete_venue(trip, venue.name)
        elif venue.action == "rate":
            self._rate_venue(trip, venue.name, venue.rating)

C. Replace VenueResolver.parse_user_input()¶

Before:

# venue_resolver.py - 100+ lines
parts = re.split(r'\s+[-–—]\s+|\s*,\s+', text, maxsplit=1)
location_match = re.search(r'\b(?:in|at|near)\s+([A-Za-z\s]+?)...', extra)
district_suffix_match = re.search(r'\b([A-Za-z]+)\s+(?:district|area|...)...', extra)
try_start = re.search(r'\b(?:must try|try|get|order|recommend)[:\s]+', extra)

After:

# venue_resolver.py - 10 lines
async def parse_user_input(self, text: str, context: EntityExtractionContext) -> VenueEntity:
    entities = await entity_extractor.extract(text, context)
    if entities.venues:
        return entities.venues[0]
    return VenueEntity(name=text)  # Fallback: treat whole text as name

Part 4: Migration Strategy¶

Phase 1: Add Entity Extractor Service (Week 1)¶

Create app/orchestrator/entity_extractor.py
Define entity dataclasses
Implement extraction with gpt-5-nano
Add comprehensive unit tests

Phase 2: Wire into SmartMessageRouter (Week 2)¶

SmartMessageRouter already has entities dict
Replace generic dict with typed EntityExtractionResult
Pass to handlers via MiniAppContext.routing_entities

Phase 3: Refactor TripPlannerHandler (Week 2-3)¶

Use pre-extracted entities instead of regex
Keep regex as fallback for 2 weeks
Monitor accuracy metrics
Remove regex after validation

Phase 4: Remove Redundant Patterns (Week 3)¶

Remove fast-path regex from IntentRouter
Remove parse_user_input regex from VenueResolver
Remove _looks_like_metadata_update from TripPlannerHandler
Clean up command_parser.py (now redundant)

Phase 5: Extend to Other MiniApps (Week 4+)¶

BillSplit: Extract person, amount, description entities
TodoList: Extract task, priority, assignee entities
Poll: Extract question, options entities

Part 5: Cost & Latency Analysis¶

gpt-5-nano Characteristics¶

Latency: ~30-50ms per call
Cost: ~$0.0001 per 1K tokens
Accuracy: Sufficient for entity extraction (not reasoning)

Current Regex Overhead¶

Pattern matching: ~1-5ms
But: Multiple passes through message (~5-10ms total)
Plus: Maintenance cost, debugging time

Net Impact¶

Latency: +30-40ms per message (acceptable)
Accuracy: Significant improvement for edge cases
Maintenance: 80% reduction in routing code

Part 6: Example Transformations¶

Example 1: Split Thought Entry¶

Messages: 1. "Add Bar Benfiddich" 2. "It's in Shinjuku" 3. "Supposed to have amazing herbal cocktails"

Extracted Entities:

// Message 1
{"venues": [{"name": "Bar Benfiddich", "action": "add"}]}

// Message 2
{"venues": [{"district": "Shinjuku", "action": "update", "references_last": true}]}

// Message 3
{"venues": [{"must_try": ["herbal cocktails"], "action": "update", "references_last": true}]}

Example 2: Bulk Import¶

Message:

1. Nudake - black pastries, Seongsu district
2. Onion Anguk - hanok cafe
3. Tosokchon - samgyetang, near Gyeongbokgung

Extracted Entities:

{
  "venues": [
    {"name": "Nudake", "must_try": ["black pastries"], "district": "Seongsu"},
    {"name": "Onion Anguk", "vibe": ["hanok"], "category": "cafe"},
    {"name": "Tosokchon", "must_try": ["samgyetang"], "district": "Gyeongbokgung"}
  ],
  "primary_action": "add_multiple_venues"
}

Example 3: Query with Filters¶

Message: "any dinner spots in Shibuya we haven't tried?"

Extracted Entities:

{
  "query": {
    "query_type": "filter",
    "filters": {
      "best_for": "dinner",
      "district": "Shibuya",
      "status": "unvisited"
    }
  },
  "primary_action": "query"
}

Part 7: Risk Mitigation¶

Risk 1: LLM Latency Spikes¶

Mitigation: - Use gpt-5-nano (fastest model) - Implement timeout (500ms) with regex fallback - Cache common patterns

Risk 2: Extraction Errors¶

Mitigation: - Keep regex as fallback for 2 weeks - Log all extractions for analysis - A/B test accuracy before full rollout

Risk 3: Cost Increase¶

Mitigation: - gpt-5-nano is extremely cheap (~$0.0001/1K tokens) - One nano call vs multiple mini calls = net savings

Part 8: Success Metrics¶

Metric	Current	Target
Lines of regex code	~500	~50
Edge cases requiring code changes	Weekly	Monthly
QA scenario pass rate	80%	95%
Split-thought success rate	70%	95%
Bulk import success rate	80%	98%

Appendix: Files to Modify¶

NEW: app/orchestrator/entity_extractor.py
NEW: app/orchestrator/entities.py (dataclasses)
MODIFY: app/orchestrator/smart_message_router.py (use EntityExtractor)
MODIFY: app/miniapps/handlers/trip_planner.py (use pre-extracted entities)
MODIFY: app/miniapps/components/venue_resolver.py (use EntityExtractor)
DEPRECATE: app/orchestrator/intent_router.py fast-paths
DEPRECATE: app/miniapps/routing/command_parser.py (merged into EntityExtractor)

Recommendation: Proceed with Phase 1 implementation. The SmartMessageRouter already has the right architecture; we just need to enhance entity extraction and wire it through the system.