Group Photo Handling - Relay Implementation Guide
Date: December 2, 2025
Author: Backend Team
Status: Ready for Implementation
Overview
This guide describes the changes needed in the Mac mini relay to support intelligent photo handling in group chats. The backend now supports deferred photo processing: photos in groups are analyzed only when Sage is actually mentioned, which avoids Vision API and memory-extraction work for photos that never involve Sage.
Why This Change?
Problem: Previously, every photo sent in a group chat was immediately processed (Vision API analysis, memory extraction) even if Sage was never mentioned. This was:
- Expensive (Vision API costs)
- Unnecessary (most group photos don't involve Sage)
- Privacy-invasive (analyzing photos Sage wasn't asked about)

Solution: The relay now passes group context, and the backend decides:
- Direct chat: Process immediately (user is talking to Sage)
- Group + Sage mentioned in caption: Process immediately
- Group + no mention: Store reference only, process later if Sage is mentioned
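The decision rule above can be sketched as a small pure function. This is an illustrative model of the backend's behavior, not its actual code: the name `decide_photo_action` is hypothetical, and the mention check (a case-insensitive whole-word match on "sage") is an assumption, since this guide does not specify how mentions are detected.

```python
import re
from typing import Optional

# Assumption: a "mention" is the word "sage" in any letter case.
# The real backend may use a different rule (e.g. aliases or @-mentions).
_SAGE_MENTION = re.compile(r"\bsage\b", re.IGNORECASE)


def decide_photo_action(is_group: bool, caption: Optional[str]) -> str:
    """Return "processing" or "stored" for an uploaded photo (hypothetical helper)."""
    if not is_group:
        return "processing"  # direct chat: the user is talking to Sage
    if caption and _SAGE_MENTION.search(caption):
        return "processing"  # group photo whose caption mentions Sage
    return "stored"  # group photo, no mention: defer analysis
```

The two return values match the response statuses the relay sees from `/photo/upload`.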
API Changes
Endpoint: POST /photo/upload
New Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| chat_guid | string | No | None | Unique identifier for the chat (same as used in /orchestrator/message) |
| is_group | boolean | No | false | Whether this photo is from a group chat |
| caption | string | No | None | Photo caption text (if any). Backend checks for Sage mentions. |
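As a rough illustration of how the three new parameters might be assembled on the relay side, the sketch below builds them as plain string form fields. The helper name `build_group_fields` is hypothetical; the `"true"`/`"false"` encoding follows the upload code later in this guide, and omitting empty optional fields so that the backend's defaults apply is an assumption about the desired behavior.

```python
from typing import Dict, Optional


def build_group_fields(
    chat_guid: Optional[str],
    is_group: bool,
    caption: Optional[str] = None,
) -> Dict[str, str]:
    """Build the new /photo/upload form fields (hypothetical helper).

    Optional fields with no value are omitted so the backend applies
    its documented defaults.
    """
    fields = {"is_group": "true" if is_group else "false"}
    if chat_guid:
        fields["chat_guid"] = chat_guid
    if caption:
        fields["caption"] = caption
    return fields
```

Each key/value pair can then be passed to `aiohttp.FormData.add_field` alongside the existing fields.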
Existing Parameters (unchanged)
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | Photo file data |
| user_phone | string | Yes | Sender's phone number |
| attachment_guid | string | No | iMessage attachment GUID (for idempotency) |
| room_id | string | No | Active mini-app room ID |
| context | string | No | JSON context string |
| X-Edge-Agent-Id | header | No | Edge agent ID for WebSocket delivery |
Implementation Steps
Step 1: Detect Group Context
When processing an incoming iMessage with a photo attachment, determine if it's from a group chat.
```python
def is_group_chat(message) -> bool:
    """
    Determine if message is from a group chat.

    iMessage chat_guid formats:
    - Direct: "iMessage;-;+15551234567"
    - Group: "iMessage;+;chat123456789"

    The middle component indicates:
    - "-" = direct message
    - "+" = group chat
    """
    # Guard against a missing chat_guid so split() cannot fail on None
    chat_guid = message.chat_guid or ""
    parts = chat_guid.split(";")
    if len(parts) >= 2:
        return parts[1] == "+"

    # Fallback: check participant count
    return len(message.participants) > 2  # More than user + Sage
```
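To make the GUID format described in the docstring concrete, here is a minimal sketch that splits a `chat_guid` into its three components and rejects anything that does not fit the pattern. `ChatGuid` and `parse_chat_guid` are illustrative names, not part of the relay or backend, and the strict three-part validation is an assumption.

```python
from typing import NamedTuple, Optional


class ChatGuid(NamedTuple):
    service: str      # e.g. "iMessage"
    kind: str         # "+" for group, "-" for direct
    identifier: str   # phone number or chat ID


def parse_chat_guid(chat_guid: str) -> Optional[ChatGuid]:
    """Split a chat_guid into its three parts, or return None if malformed."""
    parts = chat_guid.split(";", 2)
    if len(parts) != 3 or parts[1] not in ("+", "-"):
        return None
    return ChatGuid(*parts)


direct = parse_chat_guid("iMessage;-;+15551234567")
group = parse_chat_guid("iMessage;+;chat123456789")
print(direct.kind == "+", group.kind == "+")  # False True
```

Returning `None` for a malformed GUID lets the caller decide whether to fall back to another signal, such as the participant count.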
Step 2: Extract Caption
iMessage photos can have captions. Extract this text to pass to the backend.
```python
from typing import Optional


def get_photo_caption(message) -> Optional[str]:
    """
    Extract caption from photo message.

    In iMessage, a caption is typically:
    - The 'text' field when a photo has accompanying text
    - Or the 'subject' field in some cases
    """
    # Check if there's text with the photo
    if message.text and message.text.strip():
        return message.text.strip()

    # Check subject field; `or None` avoids returning "" for a whitespace-only subject
    if hasattr(message, 'subject') and message.subject:
        return message.subject.strip() or None

    return None
```
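One edge case worth checking in your relay: a photo-only message may carry an attachment placeholder in its text rather than an empty string. The sketch below assumes the placeholder is U+FFFC (OBJECT REPLACEMENT CHARACTER); verify this against the messages your relay actually receives before relying on it. `clean_caption` is a hypothetical helper.

```python
from typing import Optional

# U+FFFC OBJECT REPLACEMENT CHARACTER; assumed here to be how an
# attachment placeholder shows up inside the message text.
ATTACHMENT_PLACEHOLDER = "\ufffc"


def clean_caption(text: Optional[str]) -> Optional[str]:
    """Strip placeholders and whitespace; return None if nothing is left."""
    if not text:
        return None
    cleaned = text.replace(ATTACHMENT_PLACEHOLDER, "").strip()
    return cleaned or None
```

Without this step, a placeholder-only text would survive `strip()` and be sent to the backend as a caption.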
Step 3: Update Photo Upload Call
Modify the photo upload function to include the new parameters.
```python
import aiohttp

# iMessageObject, BACKEND_URL, and get_edge_agent_id() are expected to be
# defined elsewhere in the relay; the type is quoted so this snippet imports cleanly.


async def upload_photo_to_backend(
    photo_data: bytes,
    message: "iMessageObject",
    attachment_guid: str
) -> dict:
    """
    Upload photo to backend with group context.
    """
    # Determine group context
    is_group = is_group_chat(message)
    caption = get_photo_caption(message)

    # Prepare form data
    form_data = aiohttp.FormData()
    form_data.add_field('file', photo_data,
                        filename='photo.jpg',
                        content_type='image/jpeg')
    form_data.add_field('user_phone', message.sender)
    form_data.add_field('attachment_guid', attachment_guid)

    # NEW: Add group context
    form_data.add_field('chat_guid', message.chat_guid)
    form_data.add_field('is_group', str(is_group).lower())  # "true" or "false"
    if caption:
        form_data.add_field('caption', caption)

    # Add edge agent ID header
    headers = {
        'X-Edge-Agent-Id': get_edge_agent_id()
    }

    async with aiohttp.ClientSession() as session:
        async with session.post(
            f"{BACKEND_URL}/photo/upload",
            data=form_data,
            headers=headers
        ) as response:
            # Surface HTTP errors instead of parsing an error body as a result
            response.raise_for_status()
            return await response.json()
```
Step 4: Handle Response Status
The backend now returns one of two statuses, depending on whether the photo is analyzed immediately or stored for later:
```python
import logging

logger = logging.getLogger(__name__)


async def handle_photo_upload_response(response: dict, message: "iMessageObject"):
    """
    Handle the photo upload response.

    Response statuses:
    - "processing": Photo is being analyzed (direct chat or mentioned in group)
    - "stored": Photo stored for later (group, no mention)
    """
    status = response.get('status')
    photo_id = response.get('photo_id')

    if status == 'processing':
        # Photo is being analyzed - wait for WebSocket event or poll
        logger.info(f"Photo {photo_id} is being processed")
        # Existing flow: wait for analysis completion
    elif status == 'stored':
        # Photo stored but not processed - this is expected for groups
        logger.info(f"Photo {photo_id} stored for group (deferred processing)")
        # No need to wait - photo will be processed if Sage is mentioned later
    else:
        logger.warning(f"Unknown photo status: {status}")
```
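If string comparisons on the status become scattered across the relay, one option is to map them onto an enum once. The following is a sketch only: `PhotoStatus`, `parse_status`, and `should_wait_for_analysis` are hypothetical names, and the `UNKNOWN` member is a relay-side catch-all rather than a value the backend is documented to send.

```python
from enum import Enum


class PhotoStatus(Enum):
    PROCESSING = "processing"  # analysis started; a result will follow
    STORED = "stored"          # reference saved; analysis deferred
    UNKNOWN = "unknown"        # hypothetical catch-all, not a documented status


def parse_status(response: dict) -> PhotoStatus:
    """Map the response's 'status' field to a PhotoStatus member."""
    try:
        return PhotoStatus(response.get("status"))
    except ValueError:
        # Missing or unrecognized status values land here
        return PhotoStatus.UNKNOWN


def should_wait_for_analysis(response: dict) -> bool:
    """True only when the relay should enter its existing wait/poll flow."""
    return parse_status(response) is PhotoStatus.PROCESSING
```

Only `"processing"` triggers the wait; `"stored"` and anything unrecognized return immediately.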
Complete Example
Here's a complete example of the updated photo handling flow:
```python
import logging

import aiohttp

logger = logging.getLogger(__name__)


class PhotoHandler:
    """Handles photo messages from iMessage."""

    def __init__(self, backend_url: str, edge_agent_id: str):
        self.backend_url = backend_url
        self.edge_agent_id = edge_agent_id

    async def handle_photo_message(self, message: "iMessageObject") -> None:
        """
        Process an incoming photo message.

        Args:
            message: The iMessage object containing photo attachment
        """
        try:
            # 1. Extract photo data
            # (download_attachment is the relay's existing helper, not shown here)
            photo_data = await self.download_attachment(message.attachment_path)

            # 2. Determine group context
            is_group = self._is_group_chat(message.chat_guid)
            caption = (message.text or "").strip() or None  # Caption if any

            logger.info(
                f"Processing photo from {message.sender}",
                extra={
                    "chat_guid": message.chat_guid,
                    "is_group": is_group,
                    "has_caption": bool(caption)
                }
            )

            # 3. Upload to backend
            form_data = aiohttp.FormData()
            form_data.add_field('file', photo_data,
                                filename='photo.jpg',
                                content_type='image/jpeg')
            form_data.add_field('user_phone', message.sender)
            form_data.add_field('chat_guid', message.chat_guid)
            form_data.add_field('is_group', str(is_group).lower())

            if message.attachment_guid:
                form_data.add_field('attachment_guid', message.attachment_guid)
            if caption:
                form_data.add_field('caption', caption)

            headers = {'X-Edge-Agent-Id': self.edge_agent_id}

            async with aiohttp.ClientSession() as session:
                async with session.post(
                    f"{self.backend_url}/photo/upload",
                    data=form_data,
                    headers=headers
                ) as response:
                    response.raise_for_status()
                    result = await response.json()

            # 4. Handle response
            status = result.get('status')
            photo_id = result.get('photo_id')
            if status == 'processing':
                logger.info(f"Photo {photo_id} queued for analysis")
                # Continue with existing WebSocket/polling flow
            elif status == 'stored':
                logger.info(f"Photo {photo_id} stored (group, no mention)")
                # Nothing more to do - backend will process if Sage is mentioned
            else:
                logger.warning(f"Unknown photo status: {status}")

        except Exception as e:
            logger.error(f"Failed to handle photo: {e}", exc_info=True)

    def _is_group_chat(self, chat_guid: str) -> bool:
        """Check if chat_guid indicates a group chat."""
        # iMessage format: "iMessage;{+|-};{identifier}"
        # "+" = group, "-" = direct
        parts = (chat_guid or "").split(";")
        return len(parts) >= 2 and parts[1] == "+"
```
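Testing
Test Cases
- Direct chat photo (no change): expect status "processing"
- Group photo without mention: expect status "stored"
- Group photo with caption mentioning Sage: expect status "processing"
- Group photo, then mention Sage: expect status "stored" on upload, then deferred analysis once Sage is mentioned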
Verification
Check backend logs for these patterns:
```
# Group photo stored (no mention)
[Group] Stored photo reference without analysis (no Sage mention in caption): {photo_id}

# Group photo processed (mentioned)
Photo {photo_id} is being processed

# Deferred processing when Sage mentioned later
[Group] Found pending photo {photo_id}, analyzing now...
[Group] Processed deferred photo {photo_id}, extracted {N} memories
```
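For intuition about the deferred path these log lines describe, the sketch below models pending group photos with an in-memory store. It is not the backend's implementation: `PendingPhotoStore` and its methods are hypothetical, and keying pending photos by `chat_guid` is an assumption.

```python
from collections import defaultdict
from typing import Dict, List


class PendingPhotoStore:
    """In-memory model of deferred group photos (illustrative only)."""

    def __init__(self) -> None:
        self._pending: Dict[str, List[str]] = defaultdict(list)

    def store(self, chat_guid: str, photo_id: str) -> None:
        """Record a photo reference without analyzing it."""
        self._pending[chat_guid].append(photo_id)

    def on_sage_mentioned(self, chat_guid: str) -> List[str]:
        """Return and clear the photos that now need analysis."""
        return self._pending.pop(chat_guid, [])


store = PendingPhotoStore()
store.store("iMessage;+;chat123456789", "photo-1")
to_analyze = store.on_sage_mentioned("iMessage;+;chat123456789")
print(to_analyze)  # ['photo-1']
```

A second mention in the same chat returns an empty list, so each deferred photo is analyzed at most once in this model.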
Backward Compatibility
All new parameters have safe defaults:
- chat_guid: None - treated as direct chat
- is_group: false - treated as direct chat
- caption: None - no caption
Existing relay code will continue to work - all photos will be processed immediately (current behavior). The new group-aware behavior only activates when is_group=true is explicitly passed.
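The defaults can be illustrated with a small sketch of how a request that lacks the new fields resolves to direct-chat behavior. `resolve_group_context` is a hypothetical helper, and treating only the literal string `"true"` (in any letter case) as enabling group mode is an assumption about how the form value is parsed.

```python
from typing import Mapping, Optional, Tuple


def resolve_group_context(
    form: Mapping[str, str],
) -> Tuple[Optional[str], bool, Optional[str]]:
    """Apply the documented defaults to a form (hypothetical helper).

    Returns (chat_guid, is_group, caption).
    """
    chat_guid = form.get("chat_guid") or None
    # Assumption: only the literal string "true" (any case) enables group mode.
    is_group = form.get("is_group", "false").strip().lower() == "true"
    caption = form.get("caption") or None
    return chat_guid, is_group, caption


legacy_form = {"user_phone": "+15551234567"}  # relay that predates this change
print(resolve_group_context(legacy_form))  # (None, False, None)
```

Because `is_group` resolves to `False`, a legacy upload follows the immediate-processing path.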
Questions?
Contact the backend team if you have questions about:
- Chat GUID format detection
- Caption extraction from iMessage
- WebSocket event handling for deferred photos
- Testing group photo scenarios