Query / Message Processing Architecture

Last Verified: February 2026
Scope: Backend message ingest + orchestration pipeline (this repo)

This backend processes user messages primarily via the Mac mini edge relay (`../archety-edge`) and optionally via a legacy relay integration.
1) Ingress entry points
A. Edge relay (recommended): POST /edge/message

- Who calls it: Mac mini edge relay (`../archety-edge`)
- Auth: `Authorization: Bearer <edge_token>` (see `app/edge/auth.py`)
- Input: `FilteredMessage` (supports both legacy + modern shapes; see `app/edge/schemas.py`)
- Why it exists: the edge relay pre-filters/redacts and can maintain local context/scheduling.

Implementation: `app/api/edge_routes.py`
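As an illustration, the relay's call to this endpoint could be assembled as below. The body field names (`user_id`, `text`) are assumptions for the sketch, not the actual `FilteredMessage` schema in `app/edge/schemas.py`:

```python
import json


def build_edge_request(edge_token: str, user_id: str, text: str) -> tuple[dict, str]:
    """Build headers and a JSON body for POST /edge/message (illustrative only)."""
    headers = {
        # Bearer token is validated server-side in app/edge/auth.py
        "Authorization": f"Bearer {edge_token}",
        "Content-Type": "application/json",
    }
    # Hypothetical payload shape; the real schema lives in app/edge/schemas.py
    body = json.dumps({"user_id": user_id, "text": text})
    return headers, body
```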
B. Legacy relay: POST /orchestrator/message

- Who calls it: older relay integration
- Auth: `Authorization: Bearer <RELAY_WEBHOOK_SECRET>` (if configured)
- Input: `OrchestratorRequest` (see `app/models/schemas.py`)

Implementation: `app/main.py`
2) Core orchestrator: MessageHandler

Both ingress paths ultimately route into:

- `app/orchestrator/message_handler.py::MessageHandler.handle_message_async()`
At a high level:

- Fast reflex (optional): regex/deterministic “ack” to reduce perceived latency
  - `app/messaging/fast_reflex_detector.py`
  - `app/messaging/reflex_library.py`
- Safety + gating
  - Subscription/message limits: `app/services/subscription_access_service.py`
  - Output moderation: `app/safety/` (called from ingress handlers)
- Context build
  - Conversation history: `app/orchestrator/conversation_history_service.py`
  - Memory retrieval: `app/memory/` (`SupermemoryService` default; mem0 fallback via `get_memory_service()`)
  - Relationship state: `app/orchestrator/relationship_service_sql.py`
- Routing (single decision point)
  - `app/orchestrator/smart_message_router.py`
  - Note: the router model is configurable; in current code it’s instantiated with `gemini-2.5-flash-lite`.
- Tool execution (when needed)
  - `app/orchestrator/tool_executor.py`
  - Web/current-info routing: `app/orchestrator/smart_router.py` + `app/orchestrator/query_analyzer.py`
  - Search providers: `app/orchestrator/providers/` (Perplexity / Parallel)
- Response generation
  - `app/orchestrator/response_generator.py` (persona-styled, multi-bubble)
  - LLM wrapper: `app/utils/llm_client.py` (OpenAI GPT-5 + optional Gemini)
- Persistence + analytics
  - DB models: `app/models/database.py`
  - Analytics: `app/orchestrator/analytics_tracker.py`, `app/analytics/`
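The stage ordering above can be sketched as a single async function. Everything here other than the name `handle_message_async` is an illustrative stand-in (placeholder gating, routing heuristic, and context keys), not the real `MessageHandler` implementation:

```python
import asyncio


async def check_message_limit(user_id: str) -> bool:
    # Placeholder for app/services/subscription_access_service.py
    return True


async def handle_message_async(user_id: str, text: str) -> list[str]:
    bubbles: list[str] = []
    # 1) Fast reflex: cheap deterministic ack sent before any LLM call
    if text.endswith("?"):
        bubbles.append("hmm, let me think...")
    # 2) Safety + gating: bail out early if the user is over their limit
    if not await check_message_limit(user_id):
        return ["You've hit today's message limit."]
    # 3) Context build: history, memories, relationship state
    context = {"history": [], "memories": [], "relationship": "new"}
    # 4) Routing: a single decision point picks the handling strategy
    route = "tool" if "weather" in text.lower() else "chat"
    # 5) Tool execution, only when the router asked for it
    if route == "tool":
        context["tool_result"] = "placeholder provider result"
    # 6) Response generation: persona-styled, possibly multi-bubble
    bubbles.append(f"[{route}] reply built from {len(context)} context keys")
    return bubbles


print(asyncio.run(handle_message_async("u1", "what's the weather?")))
```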
3) Edge WebSocket “reflex first” delivery

When the edge relay maintains a WebSocket connection (`/edge/ws`), the backend can push the first bubble immediately:

- Edge WebSocket manager: `app/edge/websocket_manager.py`
- Protocol + auth details: `docs/architecture/WEBSOCKET_PROTOCOL.md`

This is used to deliver a fast first bubble (reflex) while the full response is still generating.
4) Key schemas

- Orchestrator (legacy): `app/models/schemas.py::OrchestratorRequest`, `OrchestratorResponse`
- Edge relay: `app/edge/schemas.py::FilteredMessage`, `EdgeCommand`, `CommandAck`
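To illustrate the "legacy + modern shapes" note on `FilteredMessage`, one common pattern is a constructor that accepts either field name. All field names below (`user_id`, `text`, `message`) are hypothetical; consult `app/edge/schemas.py` for the real schema:

```python
from dataclasses import dataclass


@dataclass
class FilteredMessage:
    """Illustrative stand-in for app/edge/schemas.py::FilteredMessage."""
    user_id: str
    text: str

    @classmethod
    def from_payload(cls, payload: dict) -> "FilteredMessage":
        # Accept both a modern ("text") and a legacy ("message") field name,
        # normalizing into one internal shape.
        return cls(
            user_id=payload["user_id"],
            text=payload.get("text") or payload["message"],
        )
```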
5) Quick debugging map

- Ingress logic: `app/api/edge_routes.py` and `app/main.py`
- Routing decisions: `app/orchestrator/smart_message_router.py`
- Tool routing (Perplexity/Parallel/Free APIs): `app/orchestrator/smart_router.py`
- Memory selection: `app/memory/__init__.py`