Architecture
OpenPhone is built as an Android/LineageOS-derived ROM with an OpenPhone-owned AI layer added as privileged OS components.
Layers
Applications
third-party apps
system apps
OpenPhoneAssistant
OpenPhone AI Layer
assistant UI
agent orchestrator
agent runtime jobs
runtime agent protocol
external runtime adapters
screen understanding
action execution
policy and consent
audit log
model runtime adapter
Android Framework
ActivityTaskManager
WindowManager
InputManager
NotificationManager
PackageManager
PermissionManager
System
system_server
Binder
init
SELinux
HAL/vendor interfaces
kernel
Device
device tree
kernel config
vendor blobs
firmware expectationsInitial Implementation
The current repo implements the first OpenPhone product layer:
vendor/openphoneproduct config.OpenPhoneAssistantprivileged app with a chat-style user surface, text and voice task entry, stateful mic/send/stop composer action, advanced model and developer controls, task grants, screen context, trace/audit export, OTA preview controls, and audit controls.- Initial capability and policy config files.
scripts/check.shvalidates that the assistant fallbackPolicyEnginecovers every capability inopenphone_capabilities.jsonwith the same risk class. - Initial model-tool registry.
openphone_model_tools.jsonmaps model-visible tools to product capabilities, and repo checks validate that the framework tool executor and OpenAI adapter cover the registered vocabulary. The executor enforces non-empty model-visible reasons for tools markedrequires_reason. - Agent Runtime V1 background jobs. The first assistant-side implementation persists background agent turns, schedules them with AlarmManager, wakes them at boot/package replacement/service startup, delivers notification results, and blocks state-changing background tools until a foreground reviewed approval flow exists. The runtime contract is documented in AGENT_RUNTIME_V1.md.
- Runtime Agent Protocol for external agent runtimes. This is an OpenPhone AI
layer boundary, not a separate product stack: Android owns sessions, screen
context, tool execution, confirmations, and audit, while adapters map remote
runtimes such as OpenClaw and future Hermes onto the same phone tool surface.
The protocol is documented in
runtime/runtime-agent-protocol.md, with
machine-readable command/event/capability manifests in
runtime/protocol. - Local manifest and patch-stack workflow.
- Hidden OpenPhone framework manager and Binder service.
system_serverOpenPhone agent manager service.- Foreground/visible activity context from ActivityTaskManager.
- First mediated action path for opening apps.
- Mediated web-link launch path for model
open_urltools, using the frameworknetwork.useaction policy/audit path instead of direct assistant intent launching. - Contract validation that action types emitted by the assistant framework tool
executor are present in
schemas/action-request.schema.jsonand have matching framework patch-stack handling. - Audit contract validation that framework
recordAudit(...)event names are represented inschemas/audit-event.schema.json. - Audit evidence export validation for assistant-exported framework audit JSON.
- Trajectory event contract validation for assistant-exported
events.jsonltraces. - Trajectory export validation for zipped or unpacked eval evidence, including screenshot references and leakage checks.
- Screen-context contract coverage for assistant accessibility UI-tree snapshots, including window metadata, element state, sensitivity flags, and risk hints.
- Task-scoped input action execution for navigation, pointer gestures, scroll, and text.
- Semantic UI-element targeting through assistant accessibility snapshots:
model tools can request
tap_elementorlong_press_elementby element ID, and the assistant resolves enabled bounded elements to OS-mediated pointer actions. - Confirmed clipboard write and paste actions.
- Confirmed share chooser actions.
- One-shot pending action confirmation through the assistant control surface.
- Durable framework audit log under
/data/system/openphone/. - Model adapter transport split between direct development provider calls and an OpenPhone broker/proxy mode. The broker mode keeps provider API keys off the phone by sending Responses/transcription-shaped requests to an OpenPhone-controlled endpoint with a session token.
- First reference model broker under
services/model-broker/for development deployments. It accepts phone-side broker requests, validates bearer session tokens, applies coarse size/rate limits, avoids request-body logging, and forwards to OpenAI. It supports static development tokens and signed expiring HMAC session tokens minted either by CLI or an admin-authenticated/v1/session_tokensendpoint, structured JSONL request-outcome audit events, a JSON provider/model registry with an environment model-allowlist override for local testing, and a JSON device-subject registry with optional development HMAC proof for restricting session token issuance. - Settings-owned OpenPhone task-grant defaults and first app capability policy
editor. The app policy editor writes
openphone_app_policy_overridesinSettings.Secure, using the same JSON contract consumed by assistant-side preflight before the system_ext seed policy.
This is enough to start producing OpenPhone-flavored builds once the Android tree is synced and to validate the first OS-level agent capability path. It is not yet the final AI-native OS integration.
Required Framework Work
The following components still need real Android framework implementation:
- Window/UI hierarchy extraction and screenshot/OCR perception.
- Framework-owned UI hierarchy extraction and production-grade UI-element target resolution. The current semantic targeting path is assistant-side and development-oriented.
- Notification and app-specific action integrations.
- Rich confirmation and grant lifecycle UI.
- Privilege separation between assistant UI, orchestrator, and executors.
- SELinux domains and neverallow-compatible policy.
- Production model broker identity and operations: managed certificate rollout, abuse controls, stronger device attestation, admin-token and provider-secret rotation, and production privacy enforcement. A first helper exists for rotating broker token/admin secrets and provider keys, and a first certbot/nginx helper exists for broker TLS certificate setup.
- Richer Settings-hosted durable grant editor for app-specific rules.
- Richer background task visibility in SystemUI.
Design Rule
All sensitive actions must pass through policy below the assistant UI layer. The assistant UI may request an action, but it must not be the authority that decides whether the action is allowed.