
Group Photo Handling - Relay Implementation Guide

Date: December 2, 2025
Author: Backend Team
Status: Ready for Implementation


Overview

This guide describes the changes needed in the Mac mini relay to support intelligent photo handling in group chats. The backend now supports deferred photo processing: photos in groups are analyzed only when Sage is actually mentioned, which avoids Vision API and memory-extraction work for photos Sage is never asked about.

Why This Change?

Problem: Previously, every photo sent in a group chat was immediately processed (Vision API analysis, memory extraction) even if Sage was never mentioned. This was:
- Expensive (Vision API costs)
- Unnecessary (most group photos don't involve Sage)
- Privacy-invasive (analyzing photos Sage wasn't asked about)

Solution: The relay now passes group context, and the backend decides:
- Direct chat: Process immediately (user is talking to Sage)
- Group + Sage mentioned in caption: Process immediately
- Group + no mention: Store reference only, process later if Sage is mentioned
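
The decision rules above can be sketched as a small function. This is a minimal illustration, not the backend's actual code: `PhotoAction`, `mentions_sage`, and `decide_photo_action` are hypothetical names, and the whole-word, case-insensitive match on "Sage" is an assumption about how mentions are detected.

```python
import re
from enum import Enum
from typing import Optional


class PhotoAction(Enum):
    """Hypothetical labels that mirror the response statuses in this guide."""
    PROCESS_NOW = "processing"
    STORE_ONLY = "stored"


def mentions_sage(caption: Optional[str]) -> bool:
    # Assumption: a mention is the whole word "sage", case-insensitive.
    # A plain substring test would also match words like "message".
    return bool(caption) and re.search(r"\bsage\b", caption, re.IGNORECASE) is not None


def decide_photo_action(is_group: bool, caption: Optional[str]) -> PhotoAction:
    if not is_group:
        # Direct chat: the user is talking to Sage, so analyze immediately.
        return PhotoAction.PROCESS_NOW
    if mentions_sage(caption):
        # Group chat with Sage mentioned in the caption: analyze immediately.
        return PhotoAction.PROCESS_NOW
    # Group chat without a mention: keep a reference and defer analysis.
    return PhotoAction.STORE_ONLY
```

The relay never makes this decision itself; it only supplies `is_group` and `caption` so the backend can.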


API Changes

Endpoint: POST /photo/upload

New Parameters

Parameter   Type     Required  Default  Description
chat_guid   string   No        None     Unique identifier for the chat (same as used in /orchestrator/message)
is_group    boolean  No        false    Whether this photo is from a group chat
caption     string   No        None     Photo caption text (if any). Backend checks for Sage mentions.

Existing Parameters (unchanged)

Parameter        Type    Required  Description
file             file    Yes       Photo file data
user_phone       string  Yes       Sender's phone number
attachment_guid  string  No        iMessage attachment GUID (for idempotency)
room_id          string  No        Active mini-app room ID
context          string  No        JSON context string
X-Edge-Agent-Id  header  No        Edge agent ID for WebSocket delivery
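
As a rough sketch of how the text fields in these tables fit together, the helper below assembles them into a plain dictionary, leaving out the `file` part and the header. `build_upload_fields` is a hypothetical name, and serializing the boolean as the strings "true" and "false" follows the upload code in Step 3 of this guide.

```python
from typing import Dict, Optional


def build_upload_fields(
    user_phone: str,
    chat_guid: Optional[str] = None,
    is_group: bool = False,
    caption: Optional[str] = None,
    attachment_guid: Optional[str] = None,
) -> Dict[str, str]:
    """Return the text form fields for POST /photo/upload (file part excluded)."""
    fields = {
        "user_phone": user_phone,
        # Form fields are strings, so the boolean is sent as "true" or "false".
        "is_group": "true" if is_group else "false",
    }
    optional = {
        "chat_guid": chat_guid,
        "caption": caption,
        "attachment_guid": attachment_guid,
    }
    # Optional fields are omitted entirely when unset so backend defaults apply.
    for name, value in optional.items():
        if value:
            fields[name] = value
    return fields
```

Omitting unset optional fields, rather than sending empty strings, keeps the request identical to what older relay code sends.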

Implementation Steps

Step 1: Detect Group Context

When processing an incoming iMessage with a photo attachment, determine if it's from a group chat.

def is_group_chat(message) -> bool:
    """
    Determine if message is from a group chat.

    iMessage chat_guid formats:
    - Direct: "iMessage;-;+15551234567"
    - Group:  "iMessage;+;chat123456789"

    The middle component indicates:
    - "-" = direct message
    - "+" = group chat
    """
    # Guard against a missing chat_guid so .split() cannot raise
    chat_guid = message.chat_guid or ""
    parts = chat_guid.split(";")

    if len(parts) >= 2:
        return parts[1] == "+"

    # Fallback (GUID has no ";" separators): check participant count
    participants = getattr(message, "participants", None) or []
    return len(participants) > 2  # More than user + Sage
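
To make the GUID convention concrete, here is a small self-contained sketch that classifies the two example GUIDs from the docstring above. `chat_kind` is a hypothetical helper, not part of the relay; unlike `is_group_chat`, it reports "unknown" for anything that does not match the documented three-part shape instead of falling back to participant counts.

```python
from typing import Optional


def chat_kind(chat_guid: Optional[str]) -> str:
    """Classify a chat_guid as "group", "direct", or "unknown"."""
    if not chat_guid:
        return "unknown"
    parts = chat_guid.split(";")
    if len(parts) < 3:
        # Not the documented "iMessage;{+|-};{identifier}" shape.
        return "unknown"
    return {"+": "group", "-": "direct"}.get(parts[1], "unknown")


print(chat_kind("iMessage;-;+15551234567"))   # direct
print(chat_kind("iMessage;+;chat123456789"))  # group
```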

Step 2: Extract Caption

iMessage photos can have captions. Extract this text to pass to the backend.

from typing import Optional


def get_photo_caption(message) -> Optional[str]:
    """
    Extract caption from photo message.

    In iMessage, a caption is typically:
    - The 'text' field when a photo has accompanying text
    - Or the 'subject' field in some cases
    """
    # Check if there's text with the photo
    if message.text and message.text.strip():
        return message.text.strip()

    # Check subject field (a whitespace-only subject counts as no caption)
    if hasattr(message, 'subject') and message.subject:
        return message.subject.strip() or None

    return None
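
One detail worth checking in the relay: when message text is read from the Messages database, an attachment is commonly represented in the text by the Unicode object replacement character (U+FFFC), so a photo with no caption may still carry non-empty text. The sketch below assumes that is the case for your message source and strips the placeholder before the emptiness check; `clean_caption` is a hypothetical helper.

```python
from typing import Optional

# Assumption: attachments appear in the message text as U+FFFC placeholders.
OBJECT_REPLACEMENT = "\ufffc"


def clean_caption(raw_text: Optional[str]) -> Optional[str]:
    """Return the caption without attachment placeholders, or None if empty."""
    if raw_text is None:
        return None
    cleaned = raw_text.replace(OBJECT_REPLACEMENT, "").strip()
    return cleaned or None
```

If your message objects already arrive with placeholders removed, this step is a harmless no-op.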

Step 3: Update Photo Upload Call

Modify the photo upload function to include the new parameters.

from typing import Optional

import aiohttp


async def upload_photo_to_backend(
    photo_data: bytes,
    message: iMessageObject,
    attachment_guid: Optional[str] = None
) -> dict:
    """
    Upload photo to backend with group context.
    """
    # Determine group context
    is_group = is_group_chat(message)
    caption = get_photo_caption(message)

    # Prepare form data
    form_data = aiohttp.FormData()
    form_data.add_field('file', photo_data,
                        filename='photo.jpg',
                        content_type='image/jpeg')
    form_data.add_field('user_phone', message.sender)
    # attachment_guid is optional; add_field rejects None values
    if attachment_guid:
        form_data.add_field('attachment_guid', attachment_guid)

    # NEW: Add group context
    form_data.add_field('chat_guid', message.chat_guid)
    form_data.add_field('is_group', str(is_group).lower())  # "true" or "false"
    if caption:
        form_data.add_field('caption', caption)

    # Add edge agent ID header
    headers = {
        'X-Edge-Agent-Id': get_edge_agent_id()
    }

    async with aiohttp.ClientSession() as session:
        async with session.post(
            f"{BACKEND_URL}/photo/upload",
            data=form_data,
            headers=headers
        ) as response:
            # Fail loudly on HTTP errors instead of parsing an error body
            response.raise_for_status()
            return await response.json()

Step 4: Handle Response Status

The backend now returns different statuses based on processing:

import logging

logger = logging.getLogger(__name__)


async def handle_photo_upload_response(response: dict, message: iMessageObject):
    """
    Handle the photo upload response.

    Response statuses:
    - "processing": Photo is being analyzed (direct chat or mentioned in group)
    - "stored": Photo stored for later (group, no mention)
    """
    status = response.get('status')
    photo_id = response.get('photo_id')

    if status == 'processing':
        # Photo is being analyzed - wait for WebSocket event or poll
        logger.info(f"Photo {photo_id} is being processed")
        # Existing flow: wait for analysis completion

    elif status == 'stored':
        # Photo stored but not processed - this is expected for groups
        logger.info(f"Photo {photo_id} stored for group (deferred processing)")
        # No need to wait - photo will be processed if Sage is mentioned later

    else:
        logger.warning(f"Unknown photo status: {status}")
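
The practical consequence for the relay is whether to enter the existing wait-for-analysis flow. A minimal sketch of that check, assuming only the two statuses listed above; `should_wait_for_analysis` is a hypothetical helper, and treating unknown statuses as "do not wait" is a design assumption rather than documented backend behavior.

```python
from typing import Any, Mapping


def should_wait_for_analysis(response: Mapping[str, Any]) -> bool:
    """Return True only when the backend reports the photo is being analyzed."""
    status = response.get("status")
    if status == "processing":
        return True
    # "stored" means analysis is deferred; unknown or missing statuses are
    # treated the same way so the relay never waits on an event that may not come.
    return False
```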

Complete Example

Here's a complete example of the updated photo handling flow:

import logging

import aiohttp

logger = logging.getLogger(__name__)


class PhotoHandler:
    """Handles photo messages from iMessage."""

    def __init__(self, backend_url: str, edge_agent_id: str):
        self.backend_url = backend_url
        self.edge_agent_id = edge_agent_id

    async def handle_photo_message(self, message: iMessageObject) -> None:
        """
        Process an incoming photo message.

        Args:
            message: The iMessage object containing photo attachment
        """
        try:
            # 1. Extract photo data
            photo_data = await self.download_attachment(message.attachment_path)

            # 2. Determine group context
            is_group = self._is_group_chat(message.chat_guid)
            caption = (message.text or "").strip() or None  # Caption if any

            logger.info(
                f"Processing photo from {message.sender}",
                extra={
                    "chat_guid": message.chat_guid,
                    "is_group": is_group,
                    "has_caption": bool(caption)
                }
            )

            # 3. Upload to backend
            form_data = aiohttp.FormData()
            form_data.add_field('file', photo_data,
                               filename='photo.jpg',
                               content_type='image/jpeg')
            form_data.add_field('user_phone', message.sender)
            form_data.add_field('chat_guid', message.chat_guid)
            form_data.add_field('is_group', str(is_group).lower())

            if message.attachment_guid:
                form_data.add_field('attachment_guid', message.attachment_guid)

            if caption:
                form_data.add_field('caption', caption)

            headers = {'X-Edge-Agent-Id': self.edge_agent_id}

            async with aiohttp.ClientSession() as session:
                async with session.post(
                    f"{self.backend_url}/photo/upload",
                    data=form_data,
                    headers=headers
                ) as response:
                    response.raise_for_status()  # Surface HTTP errors early
                    result = await response.json()

            # 4. Handle response
            status = result.get('status')
            photo_id = result.get('photo_id')

            if status == 'processing':
                logger.info(f"Photo {photo_id} queued for analysis")
                # Continue with existing WebSocket/polling flow

            elif status == 'stored':
                logger.info(f"Photo {photo_id} stored (group, no mention)")
                # Nothing more to do - backend will process if Sage is mentioned

            else:
                logger.warning(f"Unknown photo status: {status}")

        except Exception as e:
            logger.error(f"Failed to handle photo: {e}", exc_info=True)

    def _is_group_chat(self, chat_guid: str) -> bool:
        """Check if chat_guid indicates a group chat."""
        # iMessage format: "iMessage;{+|-};{identifier}"
        # "+" = group, "-" = direct
        parts = chat_guid.split(";")
        return len(parts) >= 2 and parts[1] == "+"

Testing

Test Cases

  1. Direct chat photo (no change)

    Send photo in 1:1 chat with Sage
    Expected: status="processing", photo analyzed immediately
    

  2. Group photo without mention

    Send photo in group chat without mentioning Sage
    Expected: status="stored", photo NOT analyzed
    

  3. Group photo with caption mentioning Sage

    Send photo in group with caption "Hey Sage check this out"
    Expected: status="processing", photo analyzed immediately
    

  4. Group photo, then mention Sage

    1. Send photo in group (no mention) → status="stored"
    2. Send text "Sage what do you think of that photo?"
    Expected: Backend processes the pending photo, Sage responds with context
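
The four cases above can be rehearsed against a tiny in-memory model before testing with a real device. This is only a model of the expected behavior, not the backend's implementation: `PendingPhotoModel` and its methods are hypothetical, and mention detection is assumed to be a whole-word, case-insensitive match on "Sage".

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional


def _mentions_sage(text: Optional[str]) -> bool:
    # Assumption: whole-word, case-insensitive match.
    return bool(text) and re.search(r"\bsage\b", text, re.IGNORECASE) is not None


class PendingPhotoModel:
    """Toy model of deferred group photo processing."""

    def __init__(self) -> None:
        # chat_guid -> photo IDs stored without analysis
        self._pending: Dict[str, List[str]] = defaultdict(list)

    def on_photo(self, chat_guid: str, photo_id: str,
                 is_group: bool, caption: Optional[str] = None) -> str:
        """Return the status the upload is expected to report."""
        if not is_group or _mentions_sage(caption):
            return "processing"
        self._pending[chat_guid].append(photo_id)
        return "stored"

    def on_text(self, chat_guid: str, text: str) -> List[str]:
        """Return the photo IDs that a later text message should trigger."""
        if not _mentions_sage(text):
            return []
        return self._pending.pop(chat_guid, [])


model = PendingPhotoModel()
print(model.on_photo("iMessage;+;chat1", "p2", is_group=True))  # stored
print(model.on_text("iMessage;+;chat1", "Sage what do you think of that photo?"))  # ['p2']
```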
    

Verification

Check backend logs for these patterns:

# Group photo stored (no mention)
[Group] Stored photo reference without analysis (no Sage mention in caption): {photo_id}

# Group photo processed (mentioned)
Photo {photo_id} is being processed

# Deferred processing when Sage mentioned later
[Group] Found pending photo {photo_id}, analyzing now...
[Group] Processed deferred photo {photo_id}, extracted {N} memories
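
When scanning a large log file, the three `[Group]` patterns above can be matched programmatically. The sketch below assumes the log lines contain the messages exactly as shown (possibly after a timestamp or level prefix) and that photo IDs contain no whitespace or commas; `classify_log_line` and the event names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Regexes transcribed from the log patterns listed above.
_PATTERNS = [
    ("stored", re.compile(
        r"\[Group\] Stored photo reference without analysis "
        r"\(no Sage mention in caption\): (?P<photo_id>\S+)")),
    ("deferred_start", re.compile(
        r"\[Group\] Found pending photo (?P<photo_id>[^\s,]+), analyzing now")),
    ("deferred_done", re.compile(
        r"\[Group\] Processed deferred photo (?P<photo_id>[^\s,]+), "
        r"extracted (?P<count>\d+) memories")),
]


def classify_log_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (event_name, photo_id) for a matching line, else None."""
    for name, pattern in _PATTERNS:
        match = pattern.search(line)  # search() tolerates any line prefix
        if match:
            return name, match.group("photo_id")
    return None
```

For test case 4, a "stored" event followed by "deferred_start" and "deferred_done" for the same photo ID indicates the deferred path ran end to end.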

Backward Compatibility

All new parameters have safe defaults:
- chat_guid: None - treated as direct chat
- is_group: false - treated as direct chat
- caption: None - no caption

Existing relay code will continue to work - all photos will be processed immediately (current behavior). The new group-aware behavior only activates when is_group=true is explicitly passed.
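
As a rough model of this defaulting (not the backend's actual parsing code), the group-aware path can be thought of as gated on a single check. `is_group_upload` is a hypothetical name, and the "true"/"false" string encoding is taken from the relay code in Step 3.

```python
from typing import Mapping


def is_group_upload(form: Mapping[str, str]) -> bool:
    """Treat an upload as a group photo only when is_group is explicitly "true"."""
    return form.get("is_group", "false").strip().lower() == "true"


# An old relay that sends no group fields is handled as a direct chat.
print(is_group_upload({"user_phone": "+15551234567"}))  # False
```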


Questions?

Contact the backend team if you have questions about:
- Chat GUID format detection
- Caption extraction from iMessage
- WebSocket event handling for deferred photos
- Testing group photo scenarios