LLM-Based Entity Extraction: Architecture Refactoring Proposal¶
Author: Claude Code Date: December 6, 2025 Status: ✅ Implemented (Phases 1-4 Complete) Priority: High
Implementation Status¶
| Phase | Description | Status |
|---|---|---|
| Phase 1 | Add Entity Extractor Service | ✅ Complete |
| Phase 2 | Wire into SmartMessageRouter | ✅ Complete |
| Phase 3 | Refactor TripPlannerHandler | ✅ Complete |
| Phase 4 | Remove Redundant Patterns | ✅ Complete |
| Phase 5 | Extend to Other MiniApps | 🔜 Future |
Key Files Created:
- app/orchestrator/entities.py - Entity dataclasses
- app/orchestrator/entity_extractor.py - LLM extraction service
Key Files Modified:
- app/orchestrator/smart_message_router.py - Calls EntityExtractor
- app/miniapps/handlers/trip_planner.py - Uses extracted entities
- app/miniapps/routing/command_parser.py - Deprecated (fallback only)
Executive Summary¶
Replace fragmented regex/keyword matching across 4+ files (~70+ patterns) with a unified LLM-based entity extraction layer using gpt-5-nano. This will:
- Reduce maintenance burden - One prompt template vs 70+ regex patterns
- Improve accuracy - LLM understands context, synonyms, and variations
- Handle edge cases naturally - No more "add a pattern for this new phrasing"
- Enable split-thought entry - LLM understands "It's in Shinjuku" refers to last venue
- Future-proof the system - New entity types don't require code changes
Part 1: Current State Analysis¶
Files with Regex/Keyword Matching¶
| File | Pattern Count | Purpose |
|---|---|---|
intent_router.py |
~15 patterns | Routing decisions |
trip_planner.py |
~35 patterns | Command matching, metadata extraction |
venue_resolver.py |
~12 patterns | Venue name/metadata parsing |
command_parser.py |
Already LLM | Command classification (blueprint) |
Pattern Categories¶
1. Command Detection (IntentRouter + TripPlanner)
# These patterns decide what action to take
"delete that", "remove it", "take off" → delete_venue
"rate it 5 stars", "5/5" → rate_venue
"mark that as visited" → mark_visited
"remind me what we planned" → recall_trip
2. Entity Extraction (VenueResolver + TripPlanner)
# These patterns extract structured data
"in Shinjuku" → district: "Shinjuku"
"Seongsu district" → district: "Seongsu"
"try the xiaolongbao" → must_try: ["xiaolongbao"]
"great for dinner" → best_for: ["dinner"]
3. Metadata Updates (TripPlanner)
# Follow-up messages about the last venue
"It's in Shinjuku" → Update last venue district
"Supposed to have amazing cocktails" → Update last venue must_try
4. Query Filters (TripPlanner + QueryEngine)
"breakfast spots?" → best_for filter: "morning"
"what's in Shibuya?" → district filter: "Shibuya"
"unvisited places" → status filter: "unvisited"
Problems with Current Approach¶
- Fragile - Each new phrasing requires new regex
- Duplicated - Same patterns in IntentRouter AND TripPlanner
- No Context - Regex can't understand "that" refers to last venue
- Hard to Debug - Which of 70 patterns matched?
- Expensive to Maintain - ~5 files need updating for new features
Part 2: Proposed Architecture¶
Core Concept: Structured Entity Extraction¶
Instead of regex, use gpt-5-nano to extract entities into typed dataclasses:
@dataclass
class VenueEntity:
"""Extracted from user message."""
name: Optional[str] = None # "Bar Benfiddich"
district: Optional[str] = None # "Shinjuku"
category: Optional[str] = None # "bar", "restaurant", "cafe"
must_try: List[str] = field(default_factory=list) # ["herbal cocktails"]
vibe: List[str] = field(default_factory=list) # ["speakeasy", "intimate"]
best_for: List[str] = field(default_factory=list) # ["dinner", "date night"]
rating: Optional[int] = None # 1-5
source_note: Optional[str] = None # "from TikTok"
# Action context
action: str = "add" # add, update, delete, rate, mark_visited
references_last: bool = False # True if "that", "it", "this" used
@dataclass
class QueryEntity:
"""Extracted from query messages."""
query_type: str = "list_all" # list_all, filter, search
filters: Dict[str, Any] = field(default_factory=dict)
# filters: {"district": "Shinjuku", "best_for": "dinner", "status": "unvisited"}
@dataclass
class TripEntity:
"""Extracted for trip-level operations."""
destination: Optional[str] = None
action: str = "set_destination" # set_destination, recall, end_session
The Entity Extractor Service¶
class EntityExtractor:
"""
LLM-based entity extraction using gpt-5-nano.
Single entry point for all entity extraction, replacing:
- VenueResolver.parse_user_input()
- TripPlannerHandler._looks_like_metadata_update()
- TripPlannerHandler._extract_* methods
- IntentRouter fast-path patterns
"""
def __init__(self):
self.llm = LLMClient(model="gpt-5-nano")
async def extract(
self,
message: str,
context: EntityExtractionContext,
) -> ExtractedEntities:
"""
Extract structured entities from a message.
Args:
message: User's message
context: Session context (last venue, destination, etc.)
Returns:
ExtractedEntities with venues, queries, and trip operations
"""
prompt = self._build_prompt(message, context)
response = self.llm.generate_response(
system_prompt=ENTITY_EXTRACTION_SYSTEM_PROMPT,
user_message=prompt,
max_tokens=300,
)
return self._parse_response(response)
System Prompt Template¶
ENTITY_EXTRACTION_SYSTEM_PROMPT = """You are an entity extractor for a trip planning assistant.
Given a user message and context, extract structured entities.
## Context You'll Receive
- Current destination (e.g., "Tokyo")
- Last added venue (for pronoun resolution)
- Session state
## Entity Types to Extract
### VenueEntity (when adding/updating venues)
- name: Venue name (or null if updating last venue via pronoun)
- district: Neighborhood/area
- category: restaurant, bar, cafe, shop, landmark, etc.
- must_try: Foods/drinks to try
- vibe: Descriptors (trendy, cozy, romantic, etc.)
- best_for: Time/occasion (morning, dinner, photos, etc.)
- rating: 1-5 stars if mentioned
- action: "add", "update", "delete", "rate", "mark_visited"
- references_last: true if message uses "that", "it", "this", "there"
### QueryEntity (when asking about venues)
- query_type: "list_all", "filter", "search"
- filters: {district, category, best_for, status, vibe}
### TripEntity (for trip-level operations)
- destination: City/country
- action: "set_destination", "recall", "end_session"
## Important Rules
1. **Pronoun Resolution**: "It's in Shinjuku" → references_last=true, district="Shinjuku"
2. **Split Thoughts**: "Supposed to have great cocktails" → update action, references_last=true
3. **Bulk Import**: Multiple venues → return array of VenueEntity
4. **Category Inference**: "coffee shop" → category="cafe"
5. **Time Inference**: "dinner spot" → best_for=["dinner"]
## Response Format
```json
{
"entities": {
"venues": [...], // Array of VenueEntity
"query": {...}, // QueryEntity if asking about venues
"trip": {...} // TripEntity if trip-level operation
},
"primary_action": "add_venue" | "update_venue" | "query" | "set_destination" | ...,
"confidence": 0.0-1.0
}
```"""
Part 3: Integration Points¶
A. Replace IntentRouter Fast-Paths¶
Before:
# intent_router.py - 50+ lines of patterns
recall_patterns = ["remind me what we", "what did we add", ...]
if any(p in message_lower for p in recall_patterns):
return RoutingDecision(...)
district_patterns = [r"(?:what'?s|spots|places)\s+(?:in|near)\s+\w+", ...]
if any(re.search(p, message_lower) for p in district_patterns):
return RoutingDecision(...)
After:
# intent_router.py - 5 lines
entities = await entity_extractor.extract(message, context)
if entities.query:
return RoutingDecision(
target=RoutingTarget.MINIAPP,
detected_action=entities.primary_action,
entities=entities.to_dict(),
)
B. Replace TripPlannerHandler Parsing¶
Before:
# trip_planner.py - 200+ lines of regex
if message_lower.startswith("trip to "):
destination = self._extract_destination(message[8:])
if trip.last_interacted_venue_id and self._looks_like_metadata_update(message):
response = self._append_metadata_to_last(...)
visited_match = re.match(r"^(visited|went to|checked out)\s+(.+)", message_lower)
rate_match = re.match(r"^(\d)\s*(?:star|stars|/5|\*)(?:\s+(?:for|to)\s+(.+))?", ...)
After:
# trip_planner.py - 20 lines
entities = context.routing_entities # Pre-extracted by SmartMessageRouter
if entities.venues:
for venue in entities.venues:
if venue.references_last:
self._update_last_venue(trip, venue)
elif venue.action == "add":
await self._add_venue_from_entity(trip, venue)
elif venue.action == "delete":
self._delete_venue(trip, venue.name)
elif venue.action == "rate":
self._rate_venue(trip, venue.name, venue.rating)
C. Replace VenueResolver.parse_user_input()¶
Before:
# venue_resolver.py - 100+ lines
parts = re.split(r'\s+[-–—]\s+|\s*,\s+', text, maxsplit=1)
location_match = re.search(r'\b(?:in|at|near)\s+([A-Za-z\s]+?)...', extra)
district_suffix_match = re.search(r'\b([A-Za-z]+)\s+(?:district|area|...)...', extra)
try_start = re.search(r'\b(?:must try|try|get|order|recommend)[:\s]+', extra)
After:
# venue_resolver.py - 10 lines
async def parse_user_input(self, text: str, context: EntityExtractionContext) -> VenueEntity:
entities = await entity_extractor.extract(text, context)
if entities.venues:
return entities.venues[0]
return VenueEntity(name=text) # Fallback: treat whole text as name
Part 4: Migration Strategy¶
Phase 1: Add Entity Extractor Service (Week 1)¶
- Create
app/orchestrator/entity_extractor.py - Define entity dataclasses
- Implement extraction with gpt-5-nano
- Add comprehensive unit tests
Phase 2: Wire into SmartMessageRouter (Week 2)¶
- SmartMessageRouter already has entities dict
- Replace generic dict with typed EntityExtractionResult
- Pass to handlers via MiniAppContext.routing_entities
Phase 3: Refactor TripPlannerHandler (Week 2-3)¶
- Use pre-extracted entities instead of regex
- Keep regex as fallback for 2 weeks
- Monitor accuracy metrics
- Remove regex after validation
Phase 4: Remove Redundant Patterns (Week 3)¶
- Remove fast-path regex from IntentRouter
- Remove parse_user_input regex from VenueResolver
- Remove _looks_like_metadata_update from TripPlannerHandler
- Clean up command_parser.py (now redundant)
Phase 5: Extend to Other MiniApps (Week 4+)¶
- BillSplit: Extract person, amount, description entities
- TodoList: Extract task, priority, assignee entities
- Poll: Extract question, options entities
Part 5: Cost & Latency Analysis¶
gpt-5-nano Characteristics¶
- Latency: ~30-50ms per call
- Cost: ~$0.0001 per 1K tokens
- Accuracy: Sufficient for entity extraction (not reasoning)
Current Regex Overhead¶
- Pattern matching: ~1-5ms
- But: Multiple passes through message (~5-10ms total)
- Plus: Maintenance cost, debugging time
Net Impact¶
- Latency: +30-40ms per message (acceptable)
- Accuracy: Significant improvement for edge cases
- Maintenance: 80% reduction in routing code
Part 6: Example Transformations¶
Example 1: Split Thought Entry¶
Messages: 1. "Add Bar Benfiddich" 2. "It's in Shinjuku" 3. "Supposed to have amazing herbal cocktails"
Extracted Entities:
// Message 1
{"venues": [{"name": "Bar Benfiddich", "action": "add"}]}
// Message 2
{"venues": [{"district": "Shinjuku", "action": "update", "references_last": true}]}
// Message 3
{"venues": [{"must_try": ["herbal cocktails"], "action": "update", "references_last": true}]}
Example 2: Bulk Import¶
Message:
1. Nudake - black pastries, Seongsu district
2. Onion Anguk - hanok cafe
3. Tosokchon - samgyetang, near Gyeongbokgung
Extracted Entities:
{
"venues": [
{"name": "Nudake", "must_try": ["black pastries"], "district": "Seongsu"},
{"name": "Onion Anguk", "vibe": ["hanok"], "category": "cafe"},
{"name": "Tosokchon", "must_try": ["samgyetang"], "district": "Gyeongbokgung"}
],
"primary_action": "add_multiple_venues"
}
Example 3: Query with Filters¶
Message: "any dinner spots in Shibuya we haven't tried?"
Extracted Entities:
{
"query": {
"query_type": "filter",
"filters": {
"best_for": "dinner",
"district": "Shibuya",
"status": "unvisited"
}
},
"primary_action": "query"
}
Part 7: Risk Mitigation¶
Risk 1: LLM Latency Spikes¶
Mitigation: - Use gpt-5-nano (fastest model) - Implement timeout (500ms) with regex fallback - Cache common patterns
Risk 2: Extraction Errors¶
Mitigation: - Keep regex as fallback for 2 weeks - Log all extractions for analysis - A/B test accuracy before full rollout
Risk 3: Cost Increase¶
Mitigation: - gpt-5-nano is extremely cheap (~$0.0001/1K tokens) - One nano call vs multiple mini calls = net savings
Part 8: Success Metrics¶
| Metric | Current | Target |
|---|---|---|
| Lines of regex code | ~500 | ~50 |
| Edge cases requiring code changes | Weekly | Monthly |
| QA scenario pass rate | 80% | 95% |
| Split-thought success rate | 70% | 95% |
| Bulk import success rate | 80% | 98% |
Appendix: Files to Modify¶
- NEW:
app/orchestrator/entity_extractor.py - NEW:
app/orchestrator/entities.py(dataclasses) - MODIFY:
app/orchestrator/smart_message_router.py(use EntityExtractor) - MODIFY:
app/miniapps/handlers/trip_planner.py(use pre-extracted entities) - MODIFY:
app/miniapps/components/venue_resolver.py(use EntityExtractor) - DEPRECATE:
app/orchestrator/intent_router.pyfast-paths - DEPRECATE:
app/miniapps/routing/command_parser.py(merged into EntityExtractor)
Recommendation: Proceed with Phase 1 implementation. The SmartMessageRouter already has the right architecture; we just need to enhance entity extraction and wire it through the system.