Query / Message Processing Architecture

Last Verified: February 2026
Scope: Backend message ingest + orchestration pipeline (this repo)

This backend processes user messages primarily via the Mac mini edge relay (`../archety-edge`) and optionally via a legacy relay integration.
1) Ingress entry points
A. Edge relay (recommended): POST /edge/message

- Who calls it: Mac mini edge relay (`../archety-edge`)
- Auth: `Authorization: Bearer <edge_token>` (see `app/edge/auth.py`)
- Input: `FilteredMessage` (supports both legacy + modern shapes; see `app/edge/schemas.py`)
- Why it exists: the edge relay pre-filters/redacts and can maintain local context/scheduling.

Implementation: `app/api/edge_routes.py`
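As an illustration, the relay's call to this endpoint could be assembled as below. The body field names (`user_id`, `text`) are assumptions for the sketch, not the actual `FilteredMessage` schema in `app/edge/schemas.py`:

```python
import json


def build_edge_request(edge_token: str, user_id: str, text: str) -> tuple[dict, str]:
    """Build headers and a JSON body for POST /edge/message (illustrative only)."""
    headers = {
        # Bearer token is validated server-side in app/edge/auth.py
        "Authorization": f"Bearer {edge_token}",
        "Content-Type": "application/json",
    }
    # Hypothetical payload shape; the real schema lives in app/edge/schemas.py
    body = json.dumps({"user_id": user_id, "text": text})
    return headers, body
```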
B. Legacy relay: POST /orchestrator/message

- Who calls it: older relay integration
- Auth: `Authorization: Bearer <RELAY_WEBHOOK_SECRET>` (if configured)
- Input: `OrchestratorRequest` (see `app/models/schemas.py`)

Implementation: `app/main.py`
2) Core orchestrator: MessageHandler

Both ingress paths ultimately route into:

- `app/orchestrator/message_handler.py::MessageHandler.handle_message_async()`
At a high level:

- Fast reflex (optional): regex/deterministic “ack” to reduce perceived latency
  - `app/messaging/fast_reflex_detector.py`
  - `app/messaging/reflex_library.py`
- Safety + gating
  - Subscription/message limits: `app/services/subscription_access_service.py`
  - Output moderation: `app/safety/` (called from ingress handlers)
- Context build
  - Conversation history: `app/orchestrator/conversation_history_service.py`
  - Memory retrieval: `app/memory/` (`SupermemoryService` default; mem0 fallback via `get_memory_service()`)
  - Relationship state: `app/orchestrator/relationship_service_sql.py`
- Routing (single decision point)
  - `app/orchestrator/smart_message_router.py`
  - Note: the router model is configurable; in current code it’s instantiated with `gemini-2.5-flash-lite`.
- Tool execution (when needed)
  - `app/orchestrator/tool_executor.py`
  - Web/current-info routing: `app/orchestrator/smart_router.py` + `app/orchestrator/query_analyzer.py`
  - Search providers: `app/orchestrator/providers/` (Perplexity / Parallel)
- Response generation
  - `app/orchestrator/response_generator.py` (persona-styled, multi-bubble)
  - LLM wrapper: `app/utils/llm_client.py` (OpenAI GPT-5 + optional Gemini)
- Persistence + analytics
  - DB models: `app/models/database.py`
  - Analytics: `app/orchestrator/analytics_tracker.py`, `app/analytics/`
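The stage ordering above can be sketched as a single async function. Everything here other than the name `handle_message_async` is an illustrative stand-in (placeholder gating, routing heuristic, and context keys), not the real `MessageHandler` implementation:

```python
import asyncio


async def check_message_limit(user_id: str) -> bool:
    # Placeholder for app/services/subscription_access_service.py
    return True


async def handle_message_async(user_id: str, text: str) -> list[str]:
    bubbles: list[str] = []
    # 1) Fast reflex: cheap deterministic ack sent before any LLM call
    if text.endswith("?"):
        bubbles.append("hmm, let me think...")
    # 2) Safety + gating: bail out early if the user is over their limit
    if not await check_message_limit(user_id):
        return ["You've hit today's message limit."]
    # 3) Context build: history, memories, relationship state
    context = {"history": [], "memories": [], "relationship": "new"}
    # 4) Routing: a single decision point picks the handling strategy
    route = "tool" if "weather" in text.lower() else "chat"
    # 5) Tool execution, only when the router asked for it
    if route == "tool":
        context["tool_result"] = "placeholder provider result"
    # 6) Response generation: persona-styled, possibly multi-bubble
    bubbles.append(f"[{route}] reply built from {len(context)} context keys")
    return bubbles


print(asyncio.run(handle_message_async("u1", "what's the weather?")))
```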
3) Edge WebSocket “reflex first” delivery

When the edge relay maintains a WebSocket connection (`/edge/ws`), the backend can push the first bubble immediately:

- Edge WebSocket manager: `app/edge/websocket_manager.py`
- Protocol + auth details: `docs/architecture/WEBSOCKET_PROTOCOL.md`

This is used to deliver a fast first bubble (reflex) while the full response is still generating.
4) Key schemas

- Orchestrator (legacy): `app/models/schemas.py::OrchestratorRequest`, `OrchestratorResponse`
- Edge relay: `app/edge/schemas.py::FilteredMessage`, `EdgeCommand`, `CommandAck`
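To illustrate the "legacy + modern shapes" note on `FilteredMessage`, one common pattern is a constructor that accepts either field name. All field names below (`user_id`, `text`, `message`) are hypothetical; consult `app/edge/schemas.py` for the real schema:

```python
from dataclasses import dataclass


@dataclass
class FilteredMessage:
    """Illustrative stand-in for app/edge/schemas.py::FilteredMessage."""
    user_id: str
    text: str

    @classmethod
    def from_payload(cls, payload: dict) -> "FilteredMessage":
        # Accept both a modern ("text") and a legacy ("message") field name,
        # normalizing into one internal shape.
        return cls(
            user_id=payload["user_id"],
            text=payload.get("text") or payload["message"],
        )
```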
5) Quick debugging map

- Ingress logic: `app/api/edge_routes.py` and `app/main.py`
- Routing decisions: `app/orchestrator/smart_message_router.py`
- Tool routing (Perplexity/Parallel/Free APIs): `app/orchestrator/smart_router.py`
- Memory selection: `app/memory/__init__.py`