
PRD — Memory System

Doc owner: Justin Audience: Eng, Design, Product, Security Status: v2 (February 2026 — all mem0 references replaced with Supermemory; multi-query retrieval documented; Event Tracker documented; nightly dedup specified; provider-agnostic forgetting/boundary interface; future_event memory type added; latency metrics split raw vs pipeline; GDPR mapped to endpoints; Section 6 rewritten from migration plan to current architecture) Depends on: PRD 3 (Progression & Loadout)

v1 changelog: Initial memory system spec — dual storage, namespace isolation, 8 memory types, classification pipeline, recall, forgetting, quality metrics, GDPR compliance


Implementation Status

Section Status Notes
Supermemory semantic search ✅ Shipped app/memory/supermemory_service.py — <300ms raw latency
Multi-query retrieval (Tolan-inspired) ✅ Shipped app/memory/query_synthesizer.py — GPT-5-nano auxiliary question generation
Namespace isolation ✅ Shipped Service-layer enforcement, not prompt-level
Memory classification pipeline ✅ Shipped Piggybacked on response generation, no extra LLM call
Photo memory extraction (GPT-5 Vision) ✅ Shipped Visual memory pipeline with trait extraction
"Forget that" command ✅ Shipped Soft-delete with 30-day hard-delete
Nightly deduplication ✅ Shipped app/memory/memory_maintenance.py — runs at 3am, cosine 0.85 threshold
Event Tracker ("I Remember") ✅ Shipped app/memory/event_tracker.py — future event detection + 3-touch proactive sequence
Persona memory (JSON + semantic) ✅ Shipped Dual storage for persona facts
GDPR data export ✅ Shipped app/api/account_routes.py — JSON export with provenance
Boundary enforcement ✅ Shipped boundary_manager.py — service-layer, provider-agnostic
Group memory extraction-only policy ❌ Not Shipped Needs logistics_only filter (see PRD 5 v2)
Memory health dashboard ❌ Not Shipped No admin UI for memory metrics
Confidence decay ❌ Not Shipped Old memories don't decay in priority (Phase 2 candidate)
Memory pinning ❌ Not Shipped No user-pinned high-priority memories (Phase 2 candidate)

References

This PRD uses standardized terminology, IDs, pricing, and model references defined in the companion documents:

Document What it Covers
REFERENCE_GLOSSARY_AND_IDS.md Canonical terms: workflow vs miniapp vs superpower, ID formats
REFERENCE_PRICING.md Canonical pricing: $7.99/mo + $50/yr, free tier limits
REFERENCE_MODEL_ROUTING.md Pipeline stage → model tier mapping
REFERENCE_DEPENDENCY_GRAPH.md PRD blocking relationships and priority order
REFERENCE_FEATURE_FLAGS.md All feature flags by category
REFERENCE_TELEMETRY.md Amplitude event catalog and gaps

Executive Summary

Memory is Ikiro's core differentiator. Every other AI assistant forgets you between sessions. Sage remembers your inside jokes, your roommate's name, your exam schedule, your fear of deep water, and the time you cried about your parents' divorce. This PRD is the unified specification for how memories are created, classified, stored, searched, recalled, and deleted.

The system has three distinct memory domains — User Memories (what Sage knows about you), Persona Memories (what Sage knows about herself), and Group Memories (what Sage knows about a group's plans) — each with strict namespace isolation. Memories carry provenance (where they came from), sensitivity flags (how carefully to handle them), and validity windows (when they expire).

Current state: Supermemory-powered semantic search operational at <300ms raw latency (25x faster than previous mem0 system). Multi-query retrieval via auxiliary question synthesis improves recall breadth. Namespace isolation enforced at service layer. Photo memory extraction live. "Forget that" command functional. Event Tracker detects future events for proactive surfacing. Nightly deduplication runs at 3am. Persona memory system with JSON + Supermemory dual storage deployed.


1) Memory Taxonomy

1.1 User Memories (About the User)

Type Examples Sensitivity Typical Lifespan
Emotional "scared about the job interview," "fought with roommate," "cried about parents" High Indefinite (decays in recall priority, never auto-deleted)
Factual "studies CS at UCLA," "allergic to peanuts," "drives a Honda Civic" Medium Indefinite
Deadline "STAT210 quiz on Nov 7," "rent due the 1st," "internship app deadline March 15" Medium Until valid_to passes
Inside Joke "the 'quack' thing from the duck incident," "calling Thursdays 'struggle bus day'" Low Indefinite
Preference "hates being asked 'how are you,'" "prefers morning check-ins," "vegetarian" Low Indefinite
People "best friend Kai," "boss named Maria," "toxic ex named Jake" High Indefinite
Visual "photo of sunset at Malibu," "messy bookshelf in dorm room" Low-Medium Indefinite
Future Event "bar exam on March 20," "friend's wedding June 12," "job interview next Thursday" Medium Until event passes (then deprioritized, not deleted). Proactively resurfaced via Event Tracker.
Boundary "don't bring up my weight," "stop checking my calendar" Critical Until explicitly revoked

1.2 Persona Memories (About Sage/Echo)

Type Examples Source
Experience "spent a summer at a coffee shop," "once tried to learn guitar" JSON file (authored)
Preference "thinks pineapple pizza is amazing" JSON file (authored)
Opinion "believes people overthink career decisions" JSON file (authored)
Learned Fact "someone shared they're afraid of deep water" Conversation (auto-classified, anonymized)
Interest "obsessed with true crime podcasts" JSON file (authored)

1.3 Group Memories (About Group Plans)

Type Examples Lifespan
Plan State "Saturday dinner at 7pm, Kai is driving" Until event passes
Poll Results "voted 7pm: Jess, Kai. voted 8pm: Marcus" Until poll closed + 7 days
Checklist "Vegas list: chargers (Kai), gum (Jess)" Until trip ends
Commitment "Marcus said he'd make the reservation" Until fulfilled or expired
Group Boundary "User A: don't bring up my work stuff" Indefinite

Group memory extraction policy: Group memories use an extraction-only policy — only logistics-relevant content (plan details, poll results, checklists, roles, logistics facts) is stored. Emotional disclosures, casual banter, and sensitive personal content from group members are seen but NOT extracted. See PRD 5 v2 Section 4.1 for the full policy. Gated by group_memory_logistics_only feature flag.


2) Memory Lifecycle

2.1 Creation

Automatic extraction (primary path):

User message → Intent/Emotion classifier → Memory classifier → Store if significant

The memory classifier runs on every user message (piggybacked on response generation, no extra LLM call). It evaluates:

  • Is this memorizable? (factual, emotional, deadline, preference, or noise?)
  • Importance score (1-10) — must be ≥4 to store
  • Memory type — which category from the taxonomy
  • Sensitivity level — low/medium/high/critical
  • Existing memory check — does this duplicate or update an existing memory?

Manual creation paths:

  • Photo upload → GPT-5 Vision extracts visual memories
  • OAuth data → Calendar events become deadline memories, email patterns become people memories
  • Admin panel → Direct memory injection for persona memories

2.2 Classification Pipeline

class MemoryClassifier:
    async def classify(self, message: str, context: ConversationContext) -> MemoryCandidate | None:
        # Stage 1: Quick filter (regex + keyword, no LLM)
        if self.is_noise(message):  # "lol", "ok", "haha", single emoji
            return None

        # Stage 2: LLM classification (piggybacked on response generation)
        classification = await self.llm_classify(message, context)

        # Stage 3: Deduplication check
        existing = await self.search_similar(classification.text, threshold=0.85)
        if existing:
            return self.merge_or_skip(existing, classification)

        # Stage 4: Importance threshold
        if classification.importance < 4:
            return None

        return classification

Deduplication: Before storing, search Supermemory for semantically similar memories (cosine similarity >0.85). If found:

  • Same fact, same timeframe → skip (duplicate)
  • Same topic, new information → update existing memory
  • Same topic, contradicting information → store new, flag old as potentially outdated

2.3 Storage

Dual storage model:

[Supermemory (Semantic Search)]    [Supabase PostgreSQL (Structured)]
├── Vector embeddings              ├── Memory metadata
├── Semantic search index          ├── Provenance records
├── Full text content              ├── Sensitivity flags
└── Namespace isolation            ├── Validity windows
                                   ├── Boundary records
                                   └── Audit trail

Why dual? Supermemory excels at semantic search ("find memories about the user's family") with <300ms latency. PostgreSQL excels at structured queries ("all memories created this week," "all boundaries for user X," "all high-sensitivity memories for GDPR export").

Namespace strategy:

Context Namespace Pattern Example
User direct chat user_{phone}_persona_{persona_id} user_+15551234567_persona_sage
Persona's own memories persona_life_{persona_id} persona_life_sage
Group chat group_{chat_guid} group_iMessage_abc123
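A minimal builder matching these patterns might look like this (the `mode` strings and the function name are illustrative; only the namespace formats come from the table above):

```python
def build_namespace(mode: str, *, user_id: str = "", persona_id: str = "",
                    chat_guid: str = "") -> str:
    """Build the Supermemory namespace key for a given context."""
    if mode == "direct":
        return f"user_{user_id}_persona_{persona_id}"
    if mode == "persona_life":
        return f"persona_life_{persona_id}"
    if mode == "group":
        return f"group_{chat_guid}"
    raise ValueError(f"Unknown mode: {mode}")
```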

2.4 Recall

Every user message triggers a memory search. The orchestrator queries Supermemory with the user's message plus recent conversation context as the search query.

Search parameters:

async def recall_memories(
    user_id: str,
    persona_id: str,
    query: str,
    mode: str,
    chat_guid: str | None = None,  # required when mode == "group"
) -> list[Memory]:
    namespace = f"user_{user_id}_persona_{persona_id}" if mode == "direct" else f"group_{chat_guid}"

    results = await supermemory.search(
        query=query,
        namespace=namespace,
        limit=10,
        threshold=0.6  # minimum relevance score
    )

    # Post-filter
    results = filter_by_validity(results)             # Remove expired memories
    results = filter_by_boundary(results)             # Respect "forget" commands
    results = rank_by_recency_and_relevance(results)  # Recent + relevant first

    return results[:5]  # Top 5 injected into context
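`rank_by_recency_and_relevance` is left abstract above. One plausible sketch blends the provider's relevance score with an exponential recency decay (the half-life and weight here are assumptions, not shipped values):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Memory:
    text: str
    relevance: float        # similarity score from the search provider (0-1)
    created_at: datetime

def rank_by_recency_and_relevance(memories: list[Memory], *,
                                  half_life_days: float = 30.0,
                                  recency_weight: float = 0.3) -> list[Memory]:
    """Sort by a blend of relevance and recency: newer memories get a boost
    that halves every `half_life_days`."""
    now = datetime.now(timezone.utc)

    def score(m: Memory) -> float:
        age_days = (now - m.created_at).total_seconds() / 86400
        recency = 0.5 ** (age_days / half_life_days)  # 1.0 today, 0.5 after one half-life
        return (1 - recency_weight) * m.relevance + recency_weight * recency

    return sorted(memories, key=score, reverse=True)
```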

Context injection: Top 5 recalled memories are injected into the system prompt as a "WHAT YOU KNOW ABOUT THIS PERSON" section. The LLM decides naturally whether and how to reference them.

Persona memory recall: Separately, 3-5 relevant persona memories are injected as "ABOUT YOU" section — enabling the companion to reference their own experiences naturally.
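Assembling the two prompt sections is a simple formatting step. A sketch (the section headers come from the spec above; the function itself is illustrative):

```python
def format_memory_sections(user_memories: list[str],
                           persona_memories: list[str]) -> str:
    """Render recalled memories as the two system-prompt sections."""
    lines = ["WHAT YOU KNOW ABOUT THIS PERSON:"]
    lines += [f"- {m}" for m in user_memories[:5]]       # top 5 user memories
    lines.append("")
    lines.append("ABOUT YOU:")
    lines += [f"- {m}" for m in persona_memories[:5]]    # 3-5 persona memories
    return "\n".join(lines)
```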

2.4.1 Multi-Query Retrieval (Tolan-Inspired)

Single-query search often misses relevant memories when the user's message is ambiguous or touches multiple topics. The multi-query retrieval pipeline synthesizes auxiliary search queries for broader recall coverage.

Pipeline:

User message: "i'm so nervous about tomorrow"
  → GPT-5-nano generates 2-3 auxiliary questions:
      Q1: "What events does the user have coming up?"
      Q2: "Has the user expressed anxiety or nervousness before?"
      Q3: "What deadlines or stressful situations has the user mentioned?"
  → Parallel Supermemory searches (Q_original + Q1 + Q2 + Q3)
  → Deduplicate results across all queries (cosine similarity >0.9 = same memory)
  → Rank by combined relevance score + recency
  → Top 5 injected into context

Implementation: app/memory/query_synthesizer.py generates auxiliary questions using GPT-5-nano (~50ms). All queries run in parallel against Supermemory. Results are deduplicated and re-ranked.

Performance: Multi-query adds ~100-150ms to the recall pipeline (synthesis + parallel search + dedup) but significantly improves recall breadth. The total multi-query pipeline target is <500ms (p50), <800ms (p95). Can be disabled via enable_multi_query_retrieval feature flag for latency-critical paths.

When multi-query fires: All 1:1 conversations. Group mode uses single-query only (logistics queries are typically unambiguous, and latency matters more in fast-moving group chats).
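The pipeline above can be sketched with the synthesis and search steps injected as callables (the function names and id-based deduplication are illustrative; production deduplicates by embedding similarity >0.9):

```python
import asyncio
from typing import Awaitable, Callable

Result = tuple[str, float]  # (memory_id, relevance score)

async def multi_query_recall(
    message: str,
    synthesize: Callable[[str], Awaitable[list[str]]],
    search: Callable[[str], Awaitable[list[Result]]],
    top_k: int = 5,
) -> list[Result]:
    """Synthesize auxiliary questions, search all queries in parallel,
    deduplicate across result lists, and rank by relevance."""
    aux = await synthesize(message)                  # 2-3 auxiliary questions
    queries = [message] + aux
    result_lists = await asyncio.gather(*(search(q) for q in queries))

    seen: set[str] = set()
    merged: list[Result] = []
    for results in result_lists:
        for mem_id, score in results:
            if mem_id not in seen:                   # drop cross-query duplicates
                seen.add(mem_id)
                merged.append((mem_id, score))

    merged.sort(key=lambda r: r[1], reverse=True)    # rank by combined relevance
    return merged[:top_k]
```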

2.5 Forgetting (Deletion)

User-initiated:

  • "Forget that" → Soft-delete the most recent memory or the memory most relevant to the preceding conversation
  • "Forget everything about [topic]" → Search and soft-delete all memories matching topic
  • "Stop checking my calendar" → Revoke OAuth + store boundary + delete OAuth-derived memories

Provider-agnostic deletion interface:

The forgetting system is defined as a service-layer contract, not coupled to any specific storage provider. All deletion operations go through the MemoryService interface:

class MemoryService:
    async def delete_by_id(self, memory_id: str) -> bool:
        """Soft-delete a specific memory. Returns True if found and deleted."""

    async def delete_by_topic(self, user_id: str, topic: str) -> int:
        """Search and soft-delete all memories matching topic. Returns count deleted."""

    async def delete_all(self, user_id: str) -> int:
        """Soft-delete all memories for a user (account deletion path). Returns count."""

    async def hard_delete_expired(self) -> int:
        """Hard-delete all memories past the 30-day grace period. Called by nightly job."""

Soft delete behavior (enforced regardless of provider):

  • Soft-deleted memories are excluded from all search results immediately
  • Hard-deleted after 30 days (GDPR compliance window)
  • User can request immediate hard deletion via account settings (POST /account/delete)
  • Deletion context is logged in the audit trail for compliance

System-initiated:

  • Deadline and future_event memories auto-expire when valid_to passes (not deleted, just deprioritized in recall)
  • Group memories for completed events expire after 7 days
  • No automatic deletion of emotional or factual memories (users control their own data)

2.6 Nightly Deduplication

Implementation: app/memory/memory_maintenance.py runs at 3am UTC via the background scheduler.

Algorithm:

  1. For each active user, retrieve all non-deleted memories from Supermemory
  2. Compute pairwise cosine similarity between memory embeddings
  3. Memories with similarity >0.85 (memory_dedup_similarity_threshold) are candidates for merging
  4. Conflict resolution:
     • Same fact, same timeframe → keep the newer one, delete the older
     • Same topic, different details → merge into a single updated memory (newer detail wins)
     • Emotional memories with different contexts → keep both (emotional memories are context-sensitive)
     • Contradicting facts → keep the newer one, mark the older as superseded
  5. Log all dedup actions to memory_audit table and fire memory_deduplicated telemetry event
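Steps 2-3 reduce to a pairwise cosine pass. A minimal sketch in pure Python (no vector library; O(n²) per user, which is acceptable at ~50 memories but would need approximate nearest-neighbour search for much larger users):

```python
def dedup_candidates(embeddings: dict[str, list[float]],
                     threshold: float = 0.85) -> list[tuple[str, str, float]]:
    """Return (id_a, id_b, similarity) for every pair above the threshold."""
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    ids = sorted(embeddings)
    pairs = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:                 # each unordered pair once
            sim = cosine(embeddings[a], embeddings[b])
            if sim > threshold:
                pairs.append((a, b, sim))
    return pairs
```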

Performance: Per-user batching. Average user with ~50 memories completes in <2 seconds. Users with 500+ memories may take 10-15 seconds. Non-blocking — does not affect real-time conversation latency.

Safeguards: Boundary memories are never deduplicated. Pinned memories (future) are never deduplicated. The dedup job is gated by enable_memory_deduplication feature flag.


3) Memory Quality & Evaluation

3.1 Recall Quality Metrics

Metric Target Measurement
Precision@5 ≥0.8 Of the top 5 recalled memories, ≥4 are relevant to the query
Recall accuracy ≥85% Semantic similarity between recalled fact and ground truth
Raw search latency (p50) <300ms Single Supermemory search (no multi-query)
Raw search latency (p95) <600ms Single Supermemory search (no multi-query)
Multi-query pipeline latency (p50) <500ms Full pipeline: synthesis + parallel search + dedup + rank
Multi-query pipeline latency (p95) <800ms Full pipeline end-to-end
False positive rate <10% Memories recalled that are irrelevant or outdated
Boundary violation rate 0% Zero tolerance — deleted/boundary memories must never appear

3.2 Evaluation Framework

Seeded eval set: 200 test scenarios with known-correct memory recalls:

{
  "scenario": "User asks about upcoming deadlines",
  "user_message": "what do I have due this week?",
  "expected_memories": ["STAT210 quiz Thursday", "essay draft due Friday"],
  "should_not_recall": ["birthday party last month", "fight with roommate"]
}

Automated testing: Run eval set nightly against production Supermemory instance. Alert if Precision@5 drops below 0.75.
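A per-scenario Precision@5 scorer over this eval set might look like the following (exact string matching is a simplification here; the real harness would score recalls by semantic similarity):

```python
def precision_at_5(recalled: list[str], expected: list[str],
                   should_not: list[str]) -> float:
    """Fraction of the top-5 recalled memories that are expected.

    A memory on the should_not list counts as irrelevant even if it also
    appears in expected."""
    top5 = recalled[:5]
    if not top5:
        return 0.0
    relevant = sum(1 for m in top5 if m in expected and m not in should_not)
    return relevant / len(top5)
```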

Manual review: Weekly sample of 50 real conversations, scored by:

  • Did Sage reference a memory when she should have?
  • Was the memory reference accurate?
  • Was it contextually appropriate (not forced or awkward)?
  • Did Sage avoid referencing memories she shouldn't have?

3.3 Memory Health Dashboard (Admin)

  • Total memories per user (distribution)
  • Memory type breakdown (emotional vs factual vs deadline)
  • Deduplication rate (% of candidate memories that were duplicates)
  • Recall hit rate (% of conversations where at least one memory was recalled)
  • Boundary enforcement audit (verify no deleted memories appeared in responses)
  • Stale memory count (memories older than 90 days never recalled)

4) Privacy Model

4.1 Sensitivity Levels

Level Examples Handling
Critical Boundaries, OAuth revocations, explicit "don't mention" requests Hard enforcement — checked before every response. Violation = P0 bug.
High Emotional disclosures, mental health, relationships, financial stress Referenced only in 1:1. Never in groups. Never in logs without redaction.
Medium Schedule, deadlines, school/work info, people's names Referenced in 1:1. Not in groups unless user explicitly shares.
Low Preferences, inside jokes, general interests Can be referenced freely in 1:1. Inside jokes never leak to groups.

4.2 Namespace Isolation (Enforced at Service Layer)

class MemoryService:
    async def search(self, user_id, persona_id, query, mode, chat_guid=None):
        if mode == "group":
            # ONLY search group namespace — never user's private memories
            namespace = f"group_{chat_guid}"
        elif mode == "direct":
            # Search user's private namespace
            namespace = f"user_{user_id}_persona_{persona_id}"
        else:
            raise ValueError(f"Unknown mode: {mode}")

        # This is the ONLY place namespace is determined
        # The orchestrator and persona engine cannot override this
        return await self.supermemory.search(query=query, namespace=namespace, limit=10)

Design principle: Namespace isolation is enforced at the service layer, not the prompt layer. Even if the LLM is instructed to ignore boundaries, it physically cannot access memories outside its namespace because the search API never returns them.

4.3 GDPR Compliance

Right Implementation Endpoint
Right to access Export all memories as JSON with provenance metadata GET /account/data-export (app/api/account_routes.py)
Right to rectification User says "actually that's wrong" → memory updated. Admin panel override for manual corrections. Conversational (orchestrator). Admin: PUT /memories/{id} (future)
Right to erasure "Forget everything" → soft-delete all → hard-delete at 30 days. Account deletion → immediate hard-delete of all data. DELETE /memories/all (soft) / POST /account/delete (hard, 30-day grace)
Right to portability Same export as Right to access — JSON with provenance, sensitivity, timestamps GET /account/data-export
Right to object Boundaries system — "don't track my [topic]" creates a user_boundary record that blocks future extraction and recall Conversational ("stop tracking X"). API: POST /boundaries (future)

4.4 Persona Memory Privacy

When Sage learns from conversations, strict anonymization applies:

User Said Sage Stores
"John at Google told me about the layoffs" "Had a conversation about tech industry layoffs"
"I'm terrified of deep water" "Someone shared they're afraid of deep water"
"My therapist says I should journal" "Heard that journaling can be therapeutic"
"I live at 123 Main St" NOT STORED (PII filter catches address patterns)
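The PII filter behind the last row could be sketched as a pattern list (these three patterns are illustrative only; a shipped filter would cover many more PII classes and likely combine patterns with model-based detection):

```python
import re

# Illustrative patterns: street address, US phone number, email.
_PII_PATTERNS = [
    re.compile(r"\b\d+\s+\w+\s+(St|Street|Ave|Avenue|Blvd|Rd|Road)\b", re.I),
    re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    re.compile(r"\b[\w.+-]+@[\w-]+\.\w{2,}\b"),
]

def contains_pii(text: str) -> bool:
    """True if the text matches any known PII pattern (so it is NOT stored)."""
    return any(p.search(text) for p in _PII_PATTERNS)
```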

5) Memory-Driven Features

5.1 Persona Modifiers (PRD 3 Integration)

Before each conversation, derive_persona_modifiers() analyzes the user's memory profile and injects behavioral instructions. See PRD 3 Section 3 for full specification.

5.2 Emotional Hook Rate

Target: ≥50% of replies reference a past shared moment. Tracked by:

def has_memory_reference(response: str, recalled_memories: list) -> bool:
    # Check if response contains natural reference to any recalled memory
    # Uses embedding similarity between response and memory content
    for memory in recalled_memories:
        if cosine_similarity(embed(response), embed(memory.text)) > 0.7:
            return True
    return False

5.3 Inside Joke Detection

Inside jokes are detected and stored with special handling:

  • Triggered when a callback to a shared moment gets a positive reaction (laughter, "lmao," "I can't")
  • Stored with type: "inside_joke" and high recall priority
  • Referenced at higher frequency than other memory types (drives perceived intimacy)
  • Weighted 5x in the progression formula (PRD 3)
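The positive-reaction trigger can be approximated with a cue list (the cues and regex below are illustrative; the shipped detector may rely on the LLM classifier instead of patterns):

```python
import re

# Illustrative laughter/reaction cues that mark a callback as landing well.
_POSITIVE_REACTIONS = re.compile(
    r"(lmao|lmfao|lol+|hahah*|i can'?t|dying|😂|💀)", re.IGNORECASE
)

def is_positive_reaction(message: str) -> bool:
    """True if the message reads as a positive reaction to a callback."""
    return bool(_POSITIVE_REACTIONS.search(message))
```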

5.4 Photo Memory Pipeline

User sends photo in iMessage
  → Edge agent forwards to backend
  → GPT-5 Vision analysis:
      - Scene description
      - Objects detected
      - People (count, not identity)
      - Mood/emotion of scene
      - Location cues
  → Memory classifier determines if memorable
  → Store as visual memory with extracted context
  → Companion acknowledges: "saved! I'll remember this"
  → Later recall: "remember that sunset photo you sent? that was gorgeous"

5.5 Event Tracker ("I Remember" Proactivity)

Implementation: app/memory/event_tracker.py

The Event Tracker detects future events mentioned in conversation and stores them as type: future_event memories with proactive surfacing schedules. This bridges PRD 7 (memory) and PRD 10 (proactive intelligence).

Detection: During memory classification, the classifier identifies future-looking statements:

  • "my bar exam is on March 20"
  • "friend's wedding is June 12"
  • "job interview next Thursday"
  • "finals start in two weeks"

These are stored with type: future_event, an importance_score (1-10 based on emotional weight), and a valid_to date (the event date).

3-Touch Proactive Sequence ("Hype Person"):

For high-importance events (importance ≥7), the Event Tracker schedules a 3-touch sequence via the proactive scheduler (PRD 10):

Touch 1 — Day before: "hey just so you know i'm thinking about
           your [event] tomorrow. you've got this 💪"

Touch 2 — Morning of: "today's the day! go crush your [event].
           i'm rooting for you 🎉"

Touch 3 — Day after:  "sooo how'd [event] go?? tell me everything"

Each touch is companion-specific (Sage = warm encouragement, Vex = hype energy, Echo = calm support). The sequence is non-blocking — if the user messages organically before a touch fires, the companion weaves the event into natural conversation instead.

Lower-importance events (importance 4-6): Single-touch reminder the day before, no follow-up sequence.
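The importance-based touch schedule can be sketched as a pure date calculation (dates only; actual send times and message copy come from the proactive scheduler in PRD 10):

```python
from datetime import date, timedelta

def schedule_touches(event_date: date, importance: int) -> list[date]:
    """Dates on which proactive touches fire for a future_event memory."""
    if importance >= 7:
        return [event_date - timedelta(days=1),   # Touch 1: day before
                event_date,                       # Touch 2: morning of
                event_date + timedelta(days=1)]   # Touch 3: day after
    if importance >= 4:
        return [event_date - timedelta(days=1)]   # single-touch reminder
    return []                                     # below the storage threshold anyway
```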

Telemetry: event_tracker_detected (detection) and event_tracker_resurfaced (proactive touch fired).


6) Current Architecture & Optimization Roadmap

6.1 Current: Supermemory (Shipped)

  • Provider: Supermemory cloud
  • Performance: <300ms raw search latency (p50), <500ms multi-query pipeline (p50)
  • Strengths: 25x faster than previous mem0 system, managed infrastructure, namespace isolation, semantic search with high recall accuracy
  • Migration: Completed. mem0 exists as legacy behind a feature flag (enable_legacy_mem0) but is not used in production.
  • Cost: ~$0.15/user/month at current usage patterns

6.2 Architecture

[Supermemory (Semantic Search)]   [Supabase PostgreSQL (Structured)]   [Background Jobs]
├── Vector embeddings             ├── memory_metadata table            ├── Nightly dedup (3am)
├── Semantic search index         ├── user_boundaries table            ├── Event Tracker scheduling
├── Full text content             ├── memory_audit table               ├── Hard-delete expired (30d)
├── Namespace isolation           ├── Provenance records               └── Memory health checks
└── Multi-query parallel search   └── GDPR operations

6.3 Optimization Roadmap (Future)

Optimization Trigger Benefit
pgvector in Supabase Supermemory costs exceed $500/month or need sub-100ms recall Self-hosted vector search, eliminates external dependency
Redis hot-cache P95 latency exceeds 800ms Cache last 10 recalled memories per user, 5-min TTL, reduces repeat searches
Graph relationships Memory-driven features need "related memories" traversal "roommate fight → apartment stress → deadline anxiety" as connected graph
Tiered storage Users with 1000+ memories Hot memories (recalled in last 30d) in fast tier, cold memories in cheap storage

7) Data Model

Core Tables

-- Memory metadata (Supabase — complements Supermemory)
CREATE TABLE memory_metadata (
    id UUID PRIMARY KEY,
    supermemory_id TEXT,  -- Reference to Supermemory record
    user_id UUID REFERENCES users(id),
    namespace TEXT NOT NULL,
    memory_type TEXT CHECK (memory_type IN ('emotional', 'factual', 'deadline', 'future_event', 'inside_joke', 'preference', 'people', 'visual', 'boundary', 'persona_experience', 'persona_opinion', 'group_plan')),
    sensitivity TEXT CHECK (sensitivity IN ('low', 'medium', 'high', 'critical')),
    importance INTEGER CHECK (importance BETWEEN 1 AND 10),
    content_preview TEXT,  -- First 100 chars (for admin UI)
    provenance JSONB,  -- {source, snippet, confidence}
    valid_from TIMESTAMPTZ,
    valid_to TIMESTAMPTZ,
    pii_tags TEXT[] DEFAULT '{}',
    recall_count INTEGER DEFAULT 0,
    last_recalled_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ DEFAULT now(),
    deleted_at TIMESTAMPTZ,
    deleted_by TEXT  -- 'user_command', 'system_expiry', 'admin', 'gdpr_request'
);

-- Boundary records (enforced before every response)
CREATE TABLE user_boundaries (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id),
    scope TEXT CHECK (scope IN ('global', 'group', 'topic')),
    chat_guid TEXT,  -- NULL for global boundaries
    boundary_text TEXT NOT NULL,
    boundary_type TEXT CHECK (boundary_type IN ('forget_topic', 'revoke_oauth', 'no_proactive', 'custom')),
    is_active BOOLEAN DEFAULT true,
    created_at TIMESTAMPTZ DEFAULT now()
);

-- Memory audit trail
CREATE TABLE memory_audit (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    memory_id UUID,
    action TEXT CHECK (action IN ('created', 'recalled', 'updated', 'soft_deleted', 'hard_deleted', 'exported')),
    actor TEXT,  -- 'system', 'user', 'admin'
    context TEXT,
    created_at TIMESTAMPTZ DEFAULT now()
);

8) API Endpoints

Endpoint Auth Purpose
GET /memories User List user's memories (paginated, filterable by type/date)
GET /memories/search User Semantic search across user's memories
GET /memories/export User GDPR export — all memories as JSON
DELETE /memories/{id} User Soft-delete specific memory
DELETE /memories/topic/{topic} User Forget all memories about a topic
DELETE /memories/all User Soft-delete everything (account deletion path)
GET /memories/health Admin Memory system health dashboard data
GET /memories/eval Admin Run evaluation suite, return metrics

9) Success Metrics

Metric Target Why
Recall accuracy ≥85% Core quality bar
Precision@5 ≥0.8 Relevant memories surfaced
Emotional hook rate ≥50% of responses reference a memory Drives perceived intimacy
Raw search latency (p50) <300ms Single Supermemory search baseline
Multi-query pipeline latency (p50) <500ms Full recall pipeline including auxiliary question synthesis
Boundary violation rate 0% Zero tolerance
Memories per active user (30d) 30-100 Healthy memory accumulation
Deduplication rate >80% Not storing redundant memories
"Forget that" success rate 100% User trust

Feature Flags & Gating

Flag Key Default Purpose
enable_memory_system true Master switch for memory storage and recall
memory_importance_threshold 4 Minimum importance score (1-10) to store
memory_search_limit 10 Max memories returned per search
memory_top_k_injected 5 Top-K memories injected into context
enable_multi_query_retrieval true Tolan-inspired auxiliary question synthesis
enable_event_tracker true Proactive "I Remember" event detection
enable_memory_deduplication true Nightly deduplication at 3am
enable_photo_memory true GPT-5 Vision memory extraction
memory_dedup_similarity_threshold 0.85 Cosine similarity for dedup matching

See REFERENCE_FEATURE_FLAGS.md for the full catalog.


Telemetry

Event Trigger Properties
memory_created New memory stored user_id, memory_type, importance, sensitivity
memory_recalled Memory surfaced in conversation user_id, memory_id, relevance_score, age_days
memory_deleted User says "forget that" user_id, memory_id, deletion_type (specific/topic/all)
memory_deduplicated Nightly dedup removes duplicate user_id, original_id, duplicate_id, similarity
memory_search Recall query executed user_id, query_count (multi-query), results_count, latency_ms
event_tracker_detected Future event identified in conversation user_id, event_type, importance_score
event_tracker_resurfaced "I Remember" proactively recalls event user_id, event_id, days_until_event
photo_memory_extracted Photo processed into memory user_id, traits_extracted, processing_time_ms
boundary_set User sets a "don't mention X" boundary user_id, boundary_type, scope
boundary_enforced Boundary prevented a recall user_id, boundary_id, blocked_memory_id

Needed but not yet tracked:

  • memory_health_check — periodic system-wide recall quality metrics
  • memory_cap_reached — Removed. PRD 12 v2 eliminated the memory cap. Memory is unlimited for all users.

See REFERENCE_TELEMETRY.md for the full event catalog.


Definition of Done

  • Raw search latency: <300ms (p50), <600ms (p95)
  • Multi-query pipeline latency: <500ms (p50), <800ms (p95)
  • Precision@5: ≥0.8 (of top 5 recalled, ≥4 relevant)
  • Boundary violation rate: 0% (deleted/boundary memories never appear)
  • "Forget that" succeeds 100% of the time with immediate effect
  • Namespace isolation enforced at service layer (not prompt level)
  • Nightly deduplication runs without errors and reduces redundancy >80%
  • Multi-query retrieval improves recall accuracy over single-query baseline
  • GDPR export includes all memories with provenance metadata
  • Photo memory pipeline processes images within 5 seconds
  • All memory operations tracked via telemetry events
  • Provider-agnostic interfaces: PRD references memory operations, not mem0/Supermemory specifics

10) Open Questions

Resolved

  • When should we migrate from mem0 to Supermemory or self-hosted? Resolved. Migration to Supermemory completed. mem0 is legacy behind enable_legacy_mem0 flag. See Section 6 for optimization roadmap if Supermemory costs exceed thresholds.
  • How do we handle contradictory memories? Resolved. Nightly deduplication (Section 2.6) handles this: contradicting facts → keep the newer one, mark the older as superseded. Real-time classification also flags contradictions at ingest (Section 2.2).

Phase 2 Candidates

  • Confidence decay: Should old memories be recalled with less certainty over time? Currently all memories have equal recall weight regardless of age. Decay would deprioritize stale memories in search ranking without deleting them. Tracked in Implementation Status as Phase 2 candidate.
  • Memory pinning: Should users be able to "pin" important memories that always have high recall priority? Would require UI in a future memory viewer. Tracked in Implementation Status as Phase 2 candidate.

Still Open

  • Should the memory viewer show everything, or curate a "highlights" view? (Depends on admin dashboard timeline — Phase 7.)
  • How do we handle memories about other people? ("My friend Kai is going through a breakup" — is this Kai's data or the user's?) Current approach: stored as the user's memory about their social context, with people type. If Kai is also an Ikiro user, their own namespace is separate. No cross-user deduplication.