
PRD — Composable MiniApp System (Superpowers Engine)

Doc owner: Justin
Audience: Eng, Design, Product
Status: v3 (February 2026 — All 13 analysis recommendations implemented)
Depends on: prd.md (core platform), COMPOSABLE_MINIAPP_SYSTEM.md (implementation reference)


Implementation Status

| Section | Status | Notes |
|---|---|---|
| Config-driven JSON superpowers | ✅ Shipped | app/miniapps/generator/app_generator.py — 15 operation types, 45 filters |
| Multi-turn creation flow | ✅ Shipped | app/miniapps/generator/creation_handler.py — confirm/refine/cancel |
| First-party miniapps (Trip Planner) | ✅ Shipped | app/miniapps/handlers/trip_planner.py |
| Bill Split miniapp | ✅ Shipped | app/miniapps/handlers/bill_split.py |
| Block-based UI system | ✅ Shipped | Header, navigation, content blocks, stats grids, card lists |
| Capability hints | ✅ Shipped | 7-day cooldown, contextual discovery |
| Dry-run testing | ⚠️ Partial | Mock responses implemented, needs UX polish |
| completion_criteria & escalation | ⚠️ Specified | Schema defined, runtime evaluation pending |
| Version pinning | ❌ Not Shipped | No superpower versioning system |
| Idempotency keys | ❌ Not Shipped | No replay protection on operations |
| Fork/remix from marketplace | ❌ Not Shipped | Depends on PRD 2 |

References

This PRD uses standardized terminology, IDs, pricing, and model references defined in the companion documents:

| Document | What it Covers |
|---|---|
| REFERENCE_GLOSSARY_AND_IDS.md | Canonical terms: workflow vs miniapp vs superpower, ID formats |
| REFERENCE_PRICING.md | Canonical pricing: $7.99/mo + $50/yr, free tier limits |
| REFERENCE_MODEL_ROUTING.md | Pipeline stage → model tier mapping |
| REFERENCE_DEPENDENCY_GRAPH.md | PRD blocking relationships and priority order |
| REFERENCE_FEATURE_FLAGS.md | All feature flags by category |
| REFERENCE_TELEMETRY.md | Amplitude event catalog and gaps |

Executive Summary

The Composable MiniApp System is the runtime engine that powers all "superpowers" inside Ikiro. It lets anyone — the team, creators, or Sage herself — define sophisticated mini-applications purely through JSON configuration. No code required.

Users create superpowers by chatting with Sage: "build me an app that splits restaurant bills from a photo." Sage translates that into a structured JSON definition, which the GenericConfigHandler interprets at runtime. The result: a receipt scanner → math engine → Venmo link generator → group message sender — all from a conversation.

This PRD covers the technical platform only. For how superpowers are discovered and distributed, see PRD 2 (Marketplace). For how users earn, equip, and level up superpowers, see PRD 3 (Progression & Loadout).


Terminology

"Superpower" is the user-facing term for a custom mini-application. Technically, superpowers are implemented as MiniApp records in the database and interpreted by the GenericConfigHandler. The codebase uses "MiniApp" for technical consistency across first-party Python apps and user-created JSON configs. PRDs and user-facing features use "superpower" for marketing clarity.


1) What Exists Today

Platform Capabilities (All Shipped)

15 Operation Types:

| Category | Operations | What They Do |
|---|---|---|
| State | state_update, emit_event, emit_events | Session data, analytics events (single/batch) |
| AI/LLM | llm_extract, vision_analysis | GPT-5 structured extraction, image analysis |
| Integrations | gmail_search, calendar_query, web_search | OAuth-scoped data access |
| Flow Control | conditional, loop | If/else branching, iteration over collections |
| UGC Composable | calculate, conversation_flow, build_url, send_message, schedule_task | Math, multi-turn dialogs, deep links, cross-chat DMs, deferred actions |

45 Template Filters across collections, strings, numbers, dates, logic, types, and UGC-specific transforms.

State Machines with auto-transitions for multi-phase apps (e.g., scanning → splitting → settling).

Security Controls: AST-based math (no eval), domain allowlists for URLs, recipient validation for messages, rate limits, HMAC webhook verification, payload size caps, action/operation limits.
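For illustration, the AST-based approach behind the math controls can be sketched as follows. This is a minimal sketch, not SafeMathParser itself (the real parser supports functions like round() and min(); names and structure here are illustrative): parse the expression once, then walk only an allowlist of node types, so there is no path to eval/exec.

```python
import ast
import operator

# Allowlisted arithmetic operators — anything else raises.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate a pure-arithmetic expression without eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        # Calls, attribute access, names, etc. are all rejected here.
        raise ValueError(f"Disallowed expression node: {type(node).__name__}")
    return walk(ast.parse(expr, mode="eval"))
```

Any attempt to smuggle in a function call (e.g. `__import__('os')`) hits the final `raise` because Call nodes are not in the allowlist.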

Infrastructure: Webhook ingestion, app analytics, versioning with rollback, starter templates, dry-run testing sandbox.

Test Coverage: 437 unit tests across 31 test files.

Three-Tier Hierarchy

| Tier | Description | Examples |
|---|---|---|
| First-Party | Python handlers with full framework access | Trip Planner, Calendar Stress, Gmail Mind Reader |
| UGC Superpowers | JSON definitions interpreted by GenericConfigHandler | User-created apps via Sage conversations |
| Marketplace Apps | Published UGC superpowers with admin review | Community-shared apps |

2) What's Next: Remaining Phases

Phase 9: Inter-App Communication (P2 — High Impact, High Effort)

Problem: Apps are isolated silos. A Bill Split app can't tell a Budget Tracker "hey, this person just paid $47 for dinner."

Solution: Event bus for typed inter-app events with explicit permission model.

Example: Bill Split emits bill_settled → Budget Tracker records expense automatically

Full specification: See PRD_1B_INTER_APP_EVENTS.md for:

- Event schema registry
- Permission model (manifest declarations)
- Subscription matching and routing
- Security boundaries
- Testing and debugging


Phase 10: Richer UI Components (P1 — High Impact, Medium Effort)

Current State: Block-based UI system ships with header blocks, navigation, content blocks, stats grids, and card lists. First-party miniapps use structured view-based components.

What's Missing: Advanced interactive components for complex data visualization and input collection.

What to build:

- Input components: Date pickers, number steppers, toggles, radio buttons (renderable in iMessage via rich link micro-pages)
- Data visualization: Bar/pie/line charts for budget breakdowns, mood trends, habit streaks
- Image gallery: For photo-based apps (receipt scanning results, trip mood boards)
- Map embed: Location pins for trip planners, restaurant finders, venue suggestions
- Approval dialogs: Confirm/cancel patterns for destructive or financial actions
- Progress indicators: Multi-step flows show where you are (Step 2 of 4)

Rendering strategy:

- iMessage: Generate web preview cards with interactive elements via hosted micro-pages (user taps link → mobile-optimized mini-UI → result flows back to chat)
- Web portal: Native React components
- Fallback: Always degrade to current text/block-based UI for pure chat contexts

UI block schema extension:

```json
{
  "type": "chart",
  "chart_type": "bar|pie|line",
  "data": "{{$state.spending_by_category | json}}",
  "title": "This Week's Spending"
}
```
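In pure chat contexts, a chart block like this would need to degrade to text. A sketch of what that fallback could look like — the function name is an assumption, and it assumes `data` has already been interpolated into a `{label: value}` dict by the template engine:

```python
def render_chart_fallback(block: dict) -> str:
    """Degrade a chart UI block to a plain-text bar chart.

    Hypothetical sketch: not the shipped renderer. Assumes `data`
    arrives as an already-interpolated {label: value} mapping.
    """
    lines = [block.get("title", "Chart")]
    data = block.get("data", {})
    peak = max(data.values(), default=1) or 1  # avoid division by zero
    for label, value in data.items():
        bar = "▇" * max(1, round(10 * value / peak))
        lines.append(f"{label:<12} {bar} {value}")
    return "\n".join(lines)
```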


Phase 13: Cron Triggers (P3 — Medium Impact, Medium Effort)

Problem: Apps can only react to user messages or webhooks. No way to run something every morning at 7am or every Monday at 9am.

Solution: Reuse existing SchedulerService (from PRD 10 Proactive Intelligence) to support superpower cron schedules.

Implementation: Superpowers reuse the same APScheduler infrastructure as workflows, sharing:

- Redis distributed locking (multi-worker safe)
- Circuit breakers (failure tracking)
- Budget enforcement (cost control)
- Timezone awareness

Schema:

```json
{
  "triggers": [
    {
      "type": "cron",
      "schedule": "0 7 * * 1-5",
      "action_id": "weekday_morning_brief"
    },
    {
      "type": "system_event",
      "event": "relationship_stage_change",
      "action_id": "celebrate_milestone"
    }
  ]
}
```

Resource Limits:

- Max 3 cron triggers per superpower (cost control)
- Max 5 executions per day per superpower (prevents runaway costs)
- Respects user quiet hours (no triggers during sleep)
- Disabled by default (user must opt-in per superpower)
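Under those limits, the scheduler-side gate for each cron tick might look like this sketch. Field names follow the SuperpowerCronSchedule model in this section; the quiet-hours window (22:00–07:00) is an assumed default, not a shipped value:

```python
from datetime import datetime, time

MAX_DAILY_EXECUTIONS = 5  # per superpower, from the limits above

def may_fire(schedule, now: datetime,
             quiet_start: time = time(22, 0),
             quiet_end: time = time(7, 0)) -> bool:
    """Return True if a cron tick is allowed to execute (sketch)."""
    if not schedule.enabled:                          # opt-in required
        return False
    if schedule.executions_today >= MAX_DAILY_EXECUTIONS:
        return False                                  # runaway-cost cap
    t = now.time()
    in_quiet = (t >= quiet_start or t < quiet_end)    # window wraps midnight
    return not in_quiet
```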

Data Model:

```python
class SuperpowerCronSchedule(Base):
    __tablename__ = "superpower_cron_schedules"

    id = Column(UUID, primary_key=True)
    superpower_id = Column(String, ForeignKey("mini_apps.id"))
    user_id = Column(UUID, ForeignKey("users.id"))
    action_id = Column(String)  # Which action to trigger
    cron_expression = Column(String)
    enabled = Column(Boolean, default=False)  # Opt-in required
    executions_today = Column(Integer, default=0)
    last_execution = Column(DateTime)
```

Interaction with Edge Agent: Cron triggers generate schedule_message commands sent to the Mac mini edge agent, which handles local execution and delivery.


3) Vibe-Coding: How Users Create Superpowers

The creation flow is conversational. Users never see or write JSON.

Flow:

1. User tells Sage what they want: "I want something that checks my email every morning and tells me if anything is urgent"
2. Sage asks clarifying questions: "want me to check just unread, or everything from the last 24h? should I flag things from your boss differently?"
3. Sage generates the JSON definition behind the scenes
4. Sage offers a test run: "try it now — say 'test morning email scanner' and I'll run it in sandbox mode"
5. User approves → app is saved to their superpower loadout
6. Optionally: user submits to marketplace for others to use

What Sage generates (keyword triggers only — cron is Phase 13):

```json
{
  "name": "Morning Email Scanner",
  "description": "Scans your inbox for urgent emails on demand",
  "completion_criteria": "User has received a categorized list of urgent vs. non-urgent emails from the last 24h with at least one actionable suggestion",
  "escalation": {
    "on_oauth_failure": "Tell user their email connection expired and offer to reconnect",
    "on_empty_result": "Confirm inbox was checked successfully",
    "on_llm_extraction_failure": "Fall back to showing raw email subjects with senders"
  },
  "triggers": [
    {
      "type": "keyword",
      "phrases": ["check my email", "anything urgent", "scan inbox"],
      "action_id": "scan_inbox"
    }
  ],
  "actions": [
    {
      "id": "scan_inbox",
      "oauth_required": ["gmail"],
      "operations": [
        {
          "type": "gmail_search",
          "query": "is:unread newer_than:24h",
          "output": "emails"
        },
        {
          "type": "llm_extract",
          "input": "{{$op.emails}}",
          "output_schema": {
            "type": "object",
            "properties": {
              "urgent": {"type": "array"},
              "can_wait": {"type": "array"}
            }
          },
          "output": "sorted"
        },
        {
          "type": "conditional",
          "condition": "{{$op.sorted.urgent | length}} > 0",
          "then": [
            {
              "type": "state_update",
              "data": {"has_urgent": true}
            }
          ],
          "else": [
            {
              "type": "state_update",
              "data": {"has_urgent": false}
            }
          ]
        }
      ],
      "response": {
        "template": "{{$state.has_urgent | ternary:'🚨 You have {{$op.sorted.urgent | length}} urgent emails':'✨ Inbox is chill — nothing urgent in the last 24h'}}"
      }
    }
  ]
}
```

Note: Cron triggers ("type": "cron") appear in the Phase 13 schema earlier in this document, but Sage won't generate them until Phase 13 ships.


Definition of Done & Escalation Rules (v2 — Schema Defined, Runtime Pending)

Every superpower must define two mandatory fields that govern quality and failure handling. This is inspired by agent role cards — the insight being that LLMs are smart enough to execute tasks, but they need explicit rules for when things go wrong and what "success" actually looks like.

```json
{
  "name": "Morning Email Scanner",
  "completion_criteria": "User has received a categorized list of urgent vs. non-urgent emails from the last 24h with at least one actionable suggestion",
  "escalation": {
    "on_oauth_failure": "Tell user their email connection expired and offer to reconnect — don't retry silently or guess",
    "on_empty_result": "Confirm inbox was checked successfully, don't say 'no emails found' without verifying access worked",
    "on_ambiguous_input": "Ask one clarifying question about time range or sender — don't assume",
    "on_llm_extraction_failure": "Fall back to showing raw email subjects with senders instead of failing silently"
  },
  "actions": [...]
}
```

completion_criteria (required string): A plain-language statement of what constitutes a successful execution. The GenericConfigHandler evaluates this at the end of each action chain. If the response doesn't meet the criteria (e.g., the LLM extraction returned empty when emails existed), the handler retries once with a modified prompt before returning a degraded response. This prevents the common failure mode where a superpower runs to completion but produces garbage output.
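Since the runtime evaluation hasn't shipped, the shape it could take is only a sketch. Here the judge and re-run steps are injected callables, and every name is an assumption, not the shipped API:

```python
import asyncio

async def evaluate_completion(response, criteria, judge, rerun):
    """Sketch of the pending completion_criteria check.

    `judge(response, criteria) -> bool` and `rerun() -> response` are
    hypothetical injected callables (e.g. a small LLM judge and a
    retry with a modified prompt). Retries at most once, then
    degrades rather than failing.
    """
    if criteria is None:
        return response, "ok"
    if await judge(response, criteria):
        return response, "ok"
    retry = await rerun()                 # single retry with modified prompt
    if await judge(retry, criteria):
        return retry, "ok"
    return response, "degraded"           # surface original + degraded flag
```

The key design point from the text survives in the sketch: one retry, then a degraded response, never a silent garbage success.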

escalation (required object): A map of failure scenarios to prescribed behaviors. Every key is a failure class, every value is what Sage should say or do. The handler checks escalation rules before generating error responses.

Standard escalation keys every superpower should address:

| Key | When It Fires | Bad Default (without escalation) |
|---|---|---|
| on_oauth_failure | Token expired, revoked, or scope insufficient | Silent failure or cryptic error |
| on_empty_result | Query returned no data | "Nothing found" when the real issue is access |
| on_ambiguous_input | User's request could mean multiple things | Guessing wrong and acting on it |
| on_rate_limit | External API rate limited | Retry loop or timeout |
| on_partial_failure | Some operations succeeded, some failed | Showing partial results without context |
| on_llm_extraction_failure | LLM couldn't parse structured data | Empty response or hallucinated structure |

Sage-generated superpowers auto-populate escalation rules based on the operations used. If an app uses gmail_search, Sage includes on_oauth_failure and on_empty_result automatically. Creators can override.
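That auto-population can be a simple operation-to-keys mapping. A sketch — the mapping below is illustrative, not the shipped table:

```python
# Which escalation keys each operation type implies (illustrative).
DEFAULT_ESCALATIONS = {
    "gmail_search": ["on_oauth_failure", "on_empty_result"],
    "calendar_query": ["on_oauth_failure", "on_empty_result"],
    "llm_extract": ["on_llm_extraction_failure"],
    "web_search": ["on_rate_limit", "on_empty_result"],
}

def required_escalation_keys(definition: dict) -> set:
    """Collect the escalation keys a definition should address,
    based on the operations its actions use."""
    keys = set()
    for action in definition.get("actions", []):
        for op in action.get("operations", []):
            keys.update(DEFAULT_ESCALATIONS.get(op.get("type"), []))
    return keys
```

Sage would then fill in default behaviors for each collected key, and creators can override any of them.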

Guardrails on AI-generated apps:

- All generated JSON is validated against the app schema before saving — including completion_criteria and at least 2 escalation rules
- OAuth scopes must match what the user has already granted (or triggers consent flow)
- Domain allowlist and security controls apply identically to AI-generated and hand-authored apps
- Dry-run sandbox is mandatory before first save

Implementation Status: Schema defined in PRD v2. Runtime evaluation pending (see todo: build completion_criteria evaluation in GenericConfigHandler).


4) Platform Limits & Governance

Resource Limits (Per App)

| Resource | Limit | Rationale |
|---|---|---|
| Actions | 30 max | Prevents unbounded complexity |
| Operations per action | 20 max | Keeps execution predictable |
| UI blocks per response | 10 max | Prevents message spam |
| Scheduled tasks per session | 10 pending max | Prevents queue abuse |
| Cron triggers per app | 3 max | Cost control (Phase 13) |
| Webhook endpoints per app | 5 max | Rate limit surface area |
| State size per session | 64KB max | Memory pressure |
| Template nesting depth | 5 levels max | Prevents infinite recursion |
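Enforcing the structural limits above at save time might look like the following sketch. The constants mirror the table; the function name and error format are assumptions:

```python
# Per-app limits from the table above (illustrative validator).
MAX_ACTIONS = 30
MAX_OPS_PER_ACTION = 20

def validate_limits(definition: dict) -> list:
    """Return a list of limit-violation messages (empty = valid)."""
    errors = []
    actions = definition.get("actions", [])
    if len(actions) > MAX_ACTIONS:
        errors.append(f"too many actions: {len(actions)} > {MAX_ACTIONS}")
    for action in actions:
        ops = action.get("operations", [])
        if len(ops) > MAX_OPS_PER_ACTION:
            errors.append(
                f"action {action.get('id')}: {len(ops)} operations > {MAX_OPS_PER_ACTION}"
            )
    return errors
```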

Operation-Specific Rate Limits

| Operation | Rate Limit | Enforcement Location |
|---|---|---|
| send_message | 5 messages per action execution | SendMessageExecutor |
| schedule_task | 10 pending tasks per session, 168-hour max delay | ScheduleTaskExecutor |
| Cron triggers | 5 executions per day per superpower | SchedulerService (Phase 13) |

Security Model

All UGC apps run inside the GenericConfigHandler sandbox. There is no escape to arbitrary code execution. The security boundaries are:

  • No eval/exec — math uses AST-based SafeMathParser
  • No network access except through declared operations (gmail_search, calendar_query, web_search, build_url)
  • No filesystem access — apps operate purely on session state
  • OAuth scopes are declared and enforced — an app can't access Gmail without gmail scope in its manifest and the user's granted consent
  • All outbound URLs are domain-allowlisted — only approved payment/map/calendar domains (see Domain Allowlist below)
  • All cross-chat messages are recipient-validated — only session participants
  • Webhook payloads are HMAC-verified and size-capped (64KB)

Domain Allowlist (for build_url operation)

The build_url operation restricts deep links to the following domains for security (prevents phishing via UGC superpowers):

Payment platforms:

- venmo.com
- cash.app
- paypal.com

Navigation/Maps:

- maps.google.com
- calendar.google.com
- maps.apple.com

Messaging:

- wa.me (WhatsApp)

To request additions: Submit via internal security review process. New domains must be:

1. HTTPS-only
2. From reputable providers (not user-controlled)
3. No user-generated content in path/params that could enable XSS
4. Business justification (which superpowers need it?)
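The allowlist and HTTPS-only checks described here are small; a sketch of how a build_url executor could apply them (the real implementation may differ, and the function name is an assumption):

```python
from urllib.parse import urlencode

# Allowlisted domains from the list above.
ALLOWED_DOMAINS = {
    "venmo.com", "cash.app", "paypal.com",
    "maps.google.com", "calendar.google.com", "maps.apple.com",
    "wa.me",
}

def build_url(domain: str, path: str, params: dict) -> str:
    """Build an HTTPS deep link, rejecting non-allowlisted domains."""
    if domain not in ALLOWED_DOMAINS:
        raise ValueError(f"Domain not allowlisted: {domain}")
    query = urlencode(params)  # params are URL-encoded by default
    return f"https://{domain}{path}" + (f"?{query}" if query else "")
```

Because the scheme is hard-coded to `https://`, HTTP links are impossible by construction, which is simpler than validating them after the fact.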

App Review (for Marketplace)

See PRD 2 (Marketplace) for the full review workflow. Key points:

- Admin review required before public listing
- Automated schema validation + security lint
- Version-level review (each update goes through review)
- Suspension mechanism for post-publish issues


5) Install Semantics

1:1 Conversations

  • Superpowers are per-user installs (UserCustomMiniAppInstall.user_id)
  • Only the user's installed superpowers can be triggered
  • Subscription required for access (see PRD 12)

Group Conversations

  • Superpowers are room-scoped (room owner's installs apply to all members)
  • Any member can trigger the room owner's installed superpowers
  • Requires room owner subscription (enforced via subscription_access_service.py)
  • Free users in a paid owner's group get full superpower access
  • Viral mechanic: Free users see superpowers in action → want to subscribe for their own groups

First-Party MiniApps (Polls, Checklists, Trip Planner)

  • Enabled as room capabilities (not individual installs)
  • Gated by room owner subscription
  • Always available to all room members

6) Dry-Run Testing Sandbox

Before saving a superpower, users can test it in sandbox mode with mock responses.

User Flow:

1. User creates superpower via chat with Sage
2. Sage says: "want to test it? say 'test morning email scanner'"
3. User triggers test run
4. Sage executes with dry_run=True, returns mock data
5. Sage: "that was a test run — no real actions happened. say 'save it' to make it real"

Implementation:

```python
# In GenericConfigHandler
async def _execute_operation(self, op, context, dry_run=False):
    if dry_run:
        if op.type == "send_message":
            return {"sent": False, "preview": op.message}
        elif op.type == "gmail_search":
            return {"emails": [{"subject": "Sample Email", "from": "boss@company.com"}]}
        elif op.type == "vision_analysis":
            return {"detected": "receipt", "merchant": "Sample Restaurant", "total": 47.52}
        # ... mock responses for all 15 operation types
    else:
        # Real execution via operation_registry
        ...
```

Scope:

- Works for all 15 operations with predefined mock responses
- No external API calls made
- No messages sent, no state persisted
- User sees realistic preview without side effects

Current Status: Mock response framework exists. Needs UX polish (explicit "test mode" indicator in responses).


7) Operation Definition of Done

Each operation type has explicit acceptance criteria that must be validated before shipping:

calculate

  • ✅ Never uses eval — AST-based SafeMathParser only
  • ✅ Supports distribute mode (splits total evenly so shares sum exactly)
  • ✅ Handles division-by-zero gracefully (returns error, not crash)
  • ✅ Supports functions: round(), ceil(), floor(), min(), max(), sum(), len(), abs()
  • ✅ Returns numbers rounded to specified decimal places
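One common way to honor the distribute guarantee (shares sum exactly to the total) is largest-remainder allocation in integer cents. A sketch of that approach — not necessarily how calculate implements it:

```python
def distribute(total_cents: int, n: int) -> list:
    """Split total_cents into n shares that sum exactly to the total.

    Works in integer cents and hands the remainder out one cent at a
    time, so rounding can never create or destroy money.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    base, remainder = divmod(total_cents, n)
    # The first `remainder` shares get one extra cent each.
    return [base + 1 if i < remainder else base for i in range(n)]
```

For example, splitting $47.53 three ways yields shares that differ by at most one cent yet always total 4753 cents exactly.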

send_message

  • ✅ Rate limited to 5 messages per action execution
  • ✅ Recipient validation (only session participants)
  • ✅ No message sent without explicit user consent in superpower flow
  • ✅ Respects user's notification preferences (quiet hours)

gmail_search

  • ✅ OAuth scope gmail required and enforced
  • ✅ Token expiry handled gracefully (escalation rule)
  • ✅ Query results returned in structured format
  • ✅ Empty results distinguished from access failures

vision_analysis

  • ✅ GPT-5 Vision model used
  • ✅ Image size limits enforced (max 20MB)
  • ✅ Returns structured JSON (not free-form text)
  • ✅ Graceful degradation on unrecognizable images

conditional

  • ✅ Condition expression evaluated via template engine
  • ✅ Both then and else branches supported
  • ✅ Nested conditionals supported up to 5 levels
  • ✅ No execution if condition evaluation fails

loop

  • ✅ Iterates over collections in state or operation results
  • ✅ Max 100 iterations per loop (prevents infinite loops)
  • ✅ Supports break on condition
  • ✅ Loop variable accessible in nested operations
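The loop semantics above can be sketched as a small bounded executor (names are illustrative; the real operation works on state/operation-result collections):

```python
MAX_ITERATIONS = 100  # hard cap from the criteria above

def run_loop(items, body, break_when=None):
    """Iterate over `items`, applying `body` to each, with the bounded
    semantics above: at most MAX_ITERATIONS passes, optional early
    break when `break_when(item)` is true."""
    results = []
    for i, item in enumerate(items):
        if i >= MAX_ITERATIONS:
            break                      # hard cap prevents runaway loops
        if break_when and break_when(item):
            break                      # "break on condition" support
        results.append(body(item))
    return results
```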

build_url

  • ✅ Domain allowlist enforced (no arbitrary URLs)
  • ✅ Params are URL-encoded by default
  • ✅ Malformed URLs return error (don't generate invalid links)
  • ✅ HTTPS-only (no HTTP allowed)

schedule_task

  • ✅ Max 10 pending tasks per session
  • ✅ Max delay: 168 hours (7 days)
  • ✅ Task context serialized and deserialized correctly
  • ✅ Failed tasks retry up to max_retries times

All Operations (Universal Requirements)

  • ✅ Template interpolation works on all input fields
  • ✅ Output stored in $op.{output_name} for chaining
  • ✅ Errors logged with operation_id and context
  • ✅ Dry-run mode supported with realistic mock data
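The $op.{output_name} chaining contract can be sketched as a tiny loop. Here `execute` stands in for the real per-type executors, and all names are illustrative:

```python
def run_operations(operations, execute):
    """Run operations in order, storing each result under its declared
    `output` name so later operations can reference it as $op.<name>.

    `execute(op, op_results) -> result` is an injected callable
    (assumption) standing in for the real operation executors.
    """
    op_results = {}
    for op in operations:
        result = execute(op, op_results)
        if "output" in op:
            op_results[op["output"]] = result  # available to later ops
    return op_results
```

In the real handler, template interpolation resolves `{{$op.<name>}}` references in each operation's fields before execution; this sketch just shows where the chained values live.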

8) Forward-Compatibility & Schema Versioning

Schema Versioning (Ship in v1.1)

All superpower definitions include a schema_version field:

```json
{
  "schema_version": "1.0",
  "metadata": {...}
}
```

Versioning Rules:

- Breaking changes increment minor version (1.0 → 1.1)
- Platform supports N-1 versions (current + previous)
- Additive changes (new operations, new filters) don't break existing apps

Deprecation Timeline (16 Weeks)

When deprecating an operation, filter, or schema field:

  • Week 0: Announce deprecation in changelog and dev documentation
  • Week 4: Add warning logs when deprecated feature is used
  • Week 8: Show warning to app creator in marketplace dashboard
  • Week 12: Disable by default (feature flag to re-enable)
  • Week 16: Full removal from platform

Auto-Migration for Simple Changes

For renames and simple structure changes, provide auto-migration:

```python
def migrate_v1_to_v1_1(definition):
    """Auto-migrate v1.0 definitions to v1.1"""
    if definition.get('schema_version') == '1.0':
        for action in definition.get('actions', []):
            for op in action.get('operations', []):
                # Example: Rename operation type
                if op['type'] == 'gmail_query':  # Old name
                    op['type'] = 'gmail_search'  # New name
        definition['schema_version'] = '1.1'
    return definition
```

Graceful Degradation

If an operation is removed entirely:

- Replace with state_update that logs deprecation error
- User sees: "This app uses a deprecated feature '[operation_name]'. Update needed."
- Creator gets email: "Your app '[App Name]' needs updating"
- App remains installed but shows error on invocation

Migration Tools

  • Command: /superpower upgrade [name] — auto-migrates app to latest schema version
  • Creator Dashboard: Shows apps needing updates with one-click upgrade button
  • Email Notifications: Sent at weeks 8, 12, and 15 for apps using deprecated features

9) Architecture

Execution Flow

```
User Message / Webhook POST / Cron Tick
  → SmartMessageRouter (intent classification) OR HMAC verification OR scheduler
  → MiniApp Handler Selection (first-party or GenericConfigHandler)
  → GenericConfigHandler
    1. Check conversation flow resumption (multi-turn)
    2. Filter actions by state machine (if defined)
    3. Match action via triggers (intent/phrase/keyword/state/routing_action/cron)
    4. Verify OAuth scopes
    5. Execute operations sequentially
       - Template interpolation on every field
       - Operation chaining via $op.{output}
       - Conditional branching and loops
       - Pause support for conversation_flow
    6. Fire inter-app events (if any emitted — Phase 9)
    7. Evaluate completion_criteria (if runtime evaluation ships)
    8. Check escalation rules on errors
    9. Evaluate auto-transitions (state machine)
    10. Interpolate response template
    11. Render UI blocks
  → MiniAppResponse (state + UI + messages)
```

Data Model

Terminology Note: Database uses MiniApp table names. User-facing features call them "superpowers."

Core tables (existing):

- mini_apps — app definitions (JSON), metadata, creator
- user_custom_miniapp_installs — user ↔ app relationship (per-user installs)
- miniapp_sessions — per-user runtime state
- miniapp_analytics_events — analytics event store
- miniapp_versions — version history with review status
- miniapp_templates — starter templates for common use cases
- miniapp_webhooks — webhook configurations

New tables (for remaining phases):

- superpower_event_subscriptions — inter-app event wiring (Phase 9)
- superpower_cron_schedules — cron job definitions and next-fire timestamps (Phase 13)


10) Success Metrics

| Metric | Target | Why |
|---|---|---|
| Apps created via vibe-coding | 50+ in first month | Validates creation UX |
| App completion rate (start → save) | >60% | Sage is generating usable apps |
| Dry-run pass rate | >80% on first attempt | Schema validation is helpful, not blocking |
| Sandbox test → save conversion | >70% | Users trust what they test |
| Operations per app (median) | 4-8 | Sweet spot of useful but not overwhelming |
| p95 execution time | <4s | Responsive enough for chat context |
| Security incidents from UGC apps | 0 | Sandbox is working |

Feature Flags & Gating

| Flag Key | Default | Purpose |
|---|---|---|
| enable_superpower_system | true | Master switch for superpower runtime |
| max_superpower_slots_free | 1 | Slot limit for free-tier users |
| max_superpower_slots_paid | 3 | Slot limit for Superpowers+ subscribers (grows to 6 at L10 per PRD 3) |
| enable_capability_hints | true | Contextual superpower discovery prompts |
| superpower_creation_enabled | true | Allow users to create new superpowers via chat |
| enable_bill_split | true | Bill Split miniapp availability |
| enable_trip_planner | true | Trip Planner miniapp availability |

See REFERENCE_FEATURE_FLAGS.md for the full catalog.


Telemetry

| Event | Trigger | Properties |
|---|---|---|
| superpower_created | User finishes creation flow | superpower_id, operation_count, creation_method |
| superpower_installed | User equips a superpower | superpower_id, slot_index, source (marketplace/created) |
| superpower_triggered | User invokes a superpower | superpower_id, trigger_type, latency_ms |
| superpower_error | Runtime error during execution | superpower_id, error_type, operation_id |
| capability_hint_shown | Hint displayed to user | hint_type, superpower_id |
| capability_hint_acted | User acts on a hint | hint_type, action (installed/dismissed) |

Needed but not yet tracked:

- superpower_tier_up — when a superpower advances from Bronze → Silver → Gold
- superpower_uninstalled — when a user removes a superpower
- superpower_creation_abandoned — when creation flow is started but not completed
- superpower_dry_run_executed — when user tests an app in sandbox mode

See REFERENCE_TELEMETRY.md for the full event catalog.


Definition of Done

  • All 15 operation types advertised to LLM in app_generator.py system prompt
  • completion_criteria and escalation fields added to CustomAppDefinition schema
  • completion_criteria runtime evaluation in GenericConfigHandler
  • Escalation rule routing on operation failures
  • All shipped features have integration tests in tests/integration/
  • Runtime sandbox prevents superpowers from accessing unauthorized data
  • Idempotency keys prevent duplicate operations on retry
  • Feature flags gate all new superpower capabilities
  • Telemetry events fire for creation, installation, trigger, and error
  • Free-tier slot limits enforced and tested (1 slot)
  • Paid-tier slot limits enforced and tested (3 starting → 6 at L10, per PRD 3)
  • Superpower creation flow handles edge cases (cancel, timeout, invalid config)
  • Performance: superpower execution completes within 4 seconds (p95), matching the Success Metrics target
  • Dry-run mode has clear UX indicators (test mode banner in responses)
  • Domain allowlist documented in security review process docs

Open Questions

  • Should first-party miniapps (Python handlers) migrate to JSON definitions over time for consistency, or keep the hybrid model?
  • What's the right UX for "this app needs Gmail access but you haven't connected Gmail yet" — inline OAuth flow or redirect to settings?
  • Should inter-app events (Phase 9) be synchronous (blocking) or async (fire-and-forget)?
  • Template filter library: should we open-source it and accept community contributions?