Personal Orchestrator Full-Breadth Operational Plan

Core thesis

The Personal Orchestrator should not be evaluated as “a Telegram cron.” It should be evaluated as a full cognitive supply chain: integrations → raw/source-near stores → extractors → context assets → faculty agents → orchestrator synthesis → action / feedback / observability.

The current system has meaningful operational proof, but the next plan needs to close the gap between backend breadth and felt collaborator experience. The missing product layer is not another screenshot idea or another digest; it is a fully observable loop where Connor can inspect, consult, correct, and benefit from the orchestrator’s work.

The cognitive supply chain to verify

Every integration must prove it can sync, preserve raw evidence, extract useful context assets, feed real faculties, influence orchestration, and accept feedback. A source that only appears in a report is not fully operational.

Sources: Open Tabs, Gmail, Calendar, audio memory, Signal Radar, GBrain, cron outputs, future desktop vision.

Raw stores: Heartbeat/source-near DBs with freshness, replayability, and error state.

Extractors: Turn raw records into typed context assets with source proof.

Context assets: Working-memory objects that faculty agents can retrieve and cite.

Faculties: Real subagents, not stubs, each judging from a cognitive perspective.

Orchestrator: Synthesizes, routes, acts, suppresses, asks, and learns from feedback.

Two axes that must stay separate

Axis 1 — Source breadth

Are all relevant integrations live, fresh, extractable, and represented as context assets?

Each source has a freshness timestamp.
Each source has raw/source-near records.
Each source has an extractor.
Each source creates useful context assets.
Each source has failure visibility and recovery notes.

Axis 2 — Interaction breadth

Can Connor actually interface with the intelligence stack as a collaborator?

Askable consults from the main Hermes chat.
Reactive scans that can move safe work.
Telegram only for high-signal interrupts, approvals, and summaries.
Observable workbench for watching, working, done, suppressed, broken.
Feedback visibly changes future routing and suppression.

What is working vs. what still feels incomplete

Working

Real faculty orchestration exists in the latest run artifacts.
Observable run artifacts exist by default.
Context asset layer is active, with source coverage and asset counts.
Scheduled operating loops exist for extraction, reactive scan, and Telegram nudge.
Feedback/action loop exists through useful/noisy/wrong/do_this.

Underdeveloped

The interface is too sparse; a 4h nudge feels like a report, not a collaborator.
The 5m reactive loop is local-only and not felt by Connor.
No first-class conversational PO consult path exists yet.
Connor messages are not yet an event-triggered PO intelligence path.
Background work is not self-evident in one workbench.

Proof risks

Counting a source as healthy without raw sync + extractor + asset proof.
Counting a faculty as running without real agent artifacts.
Claiming proactivity without evidence of safe work moved or suppressed.
Letting Telegram become the whole product surface.
Publishing E2E claims without Connor feedback effects being observable.

The corrected interaction architecture

Interaction architecture is about how Connor interfaces with the system. It is not the same as adding more integrations.

Reactive collaborator lane

Frequent cheap scans classify changes into store-only, safe local work, batch digest, ask Connor, interrupt now, or delegate background.

Conversational consult lane

Connor can ask Hermes to consult PO. The answer should use latest run artifacts, context assets, and optionally selected faculties.

Interrupt / nudge lane

Telegram becomes an attention-protected lane for P0/P1 interrupts, approvals, one high-leverage question, and completed background work.

Workbench / observability lane

A visible board shows Watching, Working on, Needs approval, Done, Suppressed, and Broken integrations.

Desktop vision is an integration, not an interaction mode

Desktop vision belongs beside Gmail, Calendar, Open Tabs, audio memory, Signal Radar, GBrain, and cron outputs. It is a source feed that expands what the orchestrator can perceive.

Integration path

desktop vision
  → raw visual heartbeat DB
  → OCR / active app / window metadata extractors
  → visual context assets
  → faculty inputs
  → orchestrator synthesis / actions / consults

Safety requirements before any loop

Local-first by default.
Pause / kill switch.
App and window blocklist.
Password, secret, and 2FA redaction.
Separate raw image retention from derived context assets.
No screenshots sent to general LLM context unless explicitly allowed or redacted.

Recommended build roadmap

Slice 1 — PO consult skill + runner

Make the orchestrator askable from normal Hermes chat. Create a skill and runner with quick, deep, and background modes. Every consult writes evidence artifacts.

/root/.hermes/skills/.../personal-orchestrator-consult
runners/consult.py

Slice 2 — Reactive delivery and routing

Upgrade the 5m loop so it can route store-only, local safe work, background delegation, approval, interrupt, and batch digest decisions without spamming.

runners/reactive_dispatch.py
tests/test_reactive_delivery_policy.py

Slice 3 — PO Workbench

Create one inspectable state surface for watching, working, needs Connor, done, suppressed, and broken sources. Make nudges and consults cite it.

state/PO_WORKBENCH.md
state/po_workbench.json

Slice 4 — Desktop visual integration spike

Validate the visual source feed only after redaction and retention policy exist. One test capture should create one heartbeat row and one context asset.

spikes/desktop-visual-screentime
raw visual heartbeat DB

Slice 5 — Full E2E breadth evaluation

Evaluate all source integrations, all extractors, all context asset skills, all faculty agents, the orchestrator synthesis, the action layer, and feedback effects with a visible proof ledger.

source breadth matrix
faculty proof matrix
feedback impact log

Definition of done

Askable

“What does my orchestrator think?” returns an evidence-backed answer with cited artifacts.

Proactive

It finds work and moves safe local pieces forward without needing explicit prompts.

Omnipresent

It continuously sees live source feeds; desktop visual context is an audited integration when added.

Attention-protected

Telegram interrupts are rare, meaningful, and feedback-aware.

Observable

Every answer, nudge, action, suppression, source failure, and feedback effect has an artifact path.

Action-capable

do_this and PO-initiated safe tasks enter a visible workbench/delegation lifecycle.

Source plan, preserved for review

The full markdown plan is preserved below so the depth can be reviewed without losing the original implementation detail.

# Personal Orchestrator Interaction Model Assessment + Redesign Plan

> **For Hermes:** Use agent-ready-requirements + subagent-driven-development if this becomes implementation work. Preserve observable artifacts and real faculty proof.

**Goal:** Reassess Connor's current Personal Orchestrator as an effective proactive/omnipresent collaborator and redesign how Connor interfaces with it beyond a 4-hour Telegram nudge.

**Current thesis:** The lower-layer Personal Orchestrator is becoming operational and observable, but the interface model is too narrow. It currently behaves like a scheduled digest + occasional card system, not yet like an omnipresent collaborator Connor can ask, interrupt, delegate to, or be ambiently understood by.

---

## Current operational status, grounded in live artifacts

### What is working

1. **Real faculty orchestration exists.**
- Latest manifest: `/root/personal-orchestrator/runs/latest/manifest.json`.
- Latest run shows all 15 expected faculties with `kind: faculty-hermes-real-agent` and `status: ok`.
- This is materially better than a deterministic cron digest.

2. **Observable run artifacts exist by default.**
- `TELEGRAM_NUDGE.md`
- `COLLABORATOR_OUTPUT.md`
- `AGENT_RUNS.md`
- `manifest.json`
- `synthesis.json`
- `ACTION_MANAGER.md`
- `ACTION_CANDIDATES.json`
- `ACTION_QUALITY_EVALS.md`
- `REAL_AGENT_GUARDRAILS.md`

3. **Context asset layer is active.**
- Latest source coverage: `/root/personal-orchestrator/state/context_assets/source_coverage.md`.
- Sources total: 11.
- Sources with source-near records: 11.
- Sources with extractors: 11.
- Sources with assets: 11.
- Asset count: 1163.

4. **There is a scheduled operating loop.**
- `personal-context-assets-extract`: every 30m, local.
- `personal-orchestrator-reactive-loop`: every 5m, local.
- `personal-intelligence-orchestrator-local`: every 240m, Telegram.

5. **Feedback/action loop exists.**
- Latest nudge asks for `useful/noisy/wrong/do_this`.
- `do_this` can create delegation/action status artifacts.
- Latest action status index includes a completed delegation from card `0ea9fbcb66`.

### What feels wrong / underdeveloped

1. **The interface is too sparse.**
- One Telegram nudge every 4h makes the system feel like a scheduled report, not an ambient collaborator.
- The nudge is attention-protected, but also too narrow to express the full intelligence behind it.

2. **The 5m reactive loop is local-only.**
- It reads the local backlog and writes local artifacts, but deliberately does not deliver anything to Telegram.
- This means the system may be “working” while Connor does not feel it.

3. **No first-class conversational interface.**
- Connor cannot naturally ask: “what is my orchestrator seeing?”, “what should I do now?”, “what patterns are emerging?”, “what is stuck?”
- Hermes chat does not automatically consult PO unless the assistant knows to inspect `/root/personal-orchestrator`.

4. **No event-triggered intelligence path for Connor messages.**
- User messages in Telegram are not yet treated as live source events that can trigger PO consultation, context asset retrieval, or background faculty runs.

5. **The integration/source breadth is not yet complete.**
- It sees existing structured integrations and audio memory, but not desktop visual context.
- Connor's idea of a laptop app that periodically captures screen images should be treated as another source integration feeding raw heartbeat data and context assets, not as an interaction architecture.

6. **Background work is not yet self-evident.**
- There is some reactive dispatch and delegation state, but not a clear “what I found / what I did / what I am watching / what I need” interface.

---

## Desired product shape

The Personal Orchestrator should have **three interaction modes** backed by a broad integration/source layer.

### Source/integration layer — ambient ingestion

Continuously sync live source feeds into raw heartbeat databases.

Examples:

- GBrain changes
- Open Tabs
- Gmail/Calendar
- audio memory
- Signal Radar
- cron outputs
- future desktop visual screentime
- future finance/messaging live feeds

Output:

```text
live source/feed
-> raw heartbeat sync DB
-> extractor
-> context assets
```

Acceptance:

- every source has freshness, depth, errors, and last useful asset proof;
- no export/manual import path counts as healthy integration.
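
The acceptance criteria above can be expressed as a per-source check. The following is a minimal sketch, not the project's actual health code: `SourceStatus`, `health_gaps`, the `live_sync` flag, and the one-hour freshness window are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class SourceStatus:
    """Hypothetical per-source health record (field names are assumptions)."""
    name: str
    last_sync_at: Optional[datetime]   # freshness
    raw_record_count: int              # depth of raw/source-near records
    has_extractor: bool
    asset_count: int                   # proof of useful context assets
    last_error: Optional[str] = None
    live_sync: bool = True             # False for export/manual import paths


def health_gaps(
    s: SourceStatus,
    now: datetime,
    max_age: timedelta = timedelta(hours=1),  # assumed freshness window
) -> List[str]:
    """Return the reasons a source should NOT count as healthy."""
    gaps = []
    if not s.live_sync:
        gaps.append("not a live sync (export/manual import)")
    if s.last_sync_at is None or now - s.last_sync_at > max_age:
        gaps.append("stale or never synced")
    if s.raw_record_count == 0:
        gaps.append("no raw/source-near records")
    if not s.has_extractor:
        gaps.append("no extractor")
    if s.asset_count == 0:
        gaps.append("no context assets")
    if s.last_error:
        gaps.append(f"last error: {s.last_error}")
    return gaps
```

A source is healthy only when `health_gaps` returns an empty list, which keeps "appears in a report" from being mistaken for "fully operational".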

### 1. Reactive collaborator mode

Run cheap trigger checks frequently, but only interrupt Connor when valuable.

Cadence proposal:

- every 5m: local reactive scan, no LLM unless needed;
- every 15m: context asset refresh / source freshness check;
- event-triggered: when Connor sends a Telegram/Hermes message with PO intent or high-signal content;
- interrupt only for P0/P1 or explicit conversational requests.

The reactive loop should decide:

- `store_only`
- `local_safe_work`
- `batch_digest`
- `ask_connor`
- `interrupt_now`
- `delegate_background`
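
One way to picture the decision step is a small rule-based router. The six decision values come from the list above; everything else in this sketch (the `Change` fields, the priority labels, and the ordering of the rules) is an assumption for illustration rather than the behaviour of `runners/reactive_dispatch.py`.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(str, Enum):
    STORE_ONLY = "store_only"
    LOCAL_SAFE_WORK = "local_safe_work"
    BATCH_DIGEST = "batch_digest"
    ASK_CONNOR = "ask_connor"
    INTERRUPT_NOW = "interrupt_now"
    DELEGATE_BACKGROUND = "delegate_background"


@dataclass
class Change:
    """Hypothetical summary of one detected change."""
    priority: str              # assumed labels: "P0", "P1", "P2", ...
    actionable: bool           # is there work PO could do?
    touches_external: bool     # would the work mutate external systems?
    needs_input: bool          # does PO need an answer from Connor?
    long_running: bool = False


def route(c: Change) -> Decision:
    """Assumed rule order: urgency first, then consent, then safe work."""
    if c.priority == "P0":
        return Decision.INTERRUPT_NOW
    if c.needs_input or (c.actionable and c.touches_external):
        return Decision.ASK_CONNOR          # external-risk work needs approval
    if c.actionable:
        if c.long_running:
            return Decision.DELEGATE_BACKGROUND
        return Decision.LOCAL_SAFE_WORK
    if c.priority == "P1":
        return Decision.BATCH_DIGEST
    return Decision.STORE_ONLY
```

Because the router is pure and needs no LLM call, it fits the "cheap trigger check" constraint of the 5m scan.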

### 2. Conversational consult mode

Connor should be able to ask the main Hermes agent questions that explicitly consult the Personal Orchestrator.

Example trigger phrases:

- “ask my orchestrator…”
- “what does PO think?”
- “what should I do now?”
- “what am I missing?”
- “what patterns are emerging?”
- “what is stuck?”
- “what has it noticed?”

Implementation options:

#### Option A — Skill-level consult

Create a `personal-orchestrator-consult` skill that any Hermes chat can load.

It should:

1. inspect latest PO run artifacts;
2. retrieve relevant context assets for the user's question;
3. optionally launch a bounded subset of real faculties or a full background run;
4. answer from PO evidence;
5. write a consult artifact;
6. optionally queue deeper work.

#### Option B — CLI tool / runner

Create:

```bash
# --mode accepts one of: quick, deep, background
python3 runners/consult.py --question "..." --mode quick
```

Modes:

- `quick`: latest run + context asset retrieval, answer in seconds.
- `deep`: spawn selected real faculties, answer with proof.
- `background`: queue a full run/delegation and report status.
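
A possible shape for the runner's command-line surface is sketched below. Only the `--question` and `--mode` flags and the three mode names come from the plan; `build_parser`, `plan_consult`, the step descriptions, and the `quick` default are hypothetical.

```python
import argparse
from typing import List

MODES = ("quick", "deep", "background")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="consult.py",
        description="Consult the Personal Orchestrator (illustrative sketch).",
    )
    parser.add_argument("--question", required=True, help="question to answer")
    # Defaulting to "quick" is an assumption; the plan does not state a default.
    parser.add_argument("--mode", choices=MODES, default="quick")
    return parser


def plan_consult(mode: str) -> List[str]:
    """Return the ordered steps a consult in the given mode would perform."""
    if mode == "background":
        steps = ["queue full run/delegation", "report tracking status"]
    else:
        steps = ["load latest run artifacts", "retrieve relevant context assets"]
        if mode == "deep":
            steps.append("spawn selected real faculties")
    steps.append("write consult artifact")  # every consult leaves evidence
    return steps
```

For example, `build_parser().parse_args(["--question", "what is stuck?", "--mode", "deep"])` yields a namespace whose `mode` is `"deep"`, and an unknown mode is rejected by `argparse` before any work starts.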

#### Option C — Hermes hook/plugin

Longer term: when Connor sends any substantive message, a middleware classifies whether PO consultation is useful and injects PO context into the main agent automatically.

Recommended path: build Option A + B first, then consider plugin/hook.

### 3. Background problem-moving mode

The orchestrator should not only suggest; it should move safe work forward.

Examples:

- produce local diagnostic reports;
- reconcile contradictory system state;
- draft specs/plans;
- queue review packets;
- update local status/index artifacts;
- run source-health repairs that do not mutate external systems;
- propose approval cards for external-risk work.

Key UX:

- no chat hijack;
- notify only on meaningful state changes;
- maintain a visible “what PO is working on” board.

Artifacts:

- `state/po_workbench.json`
- `state/PO_WORKBENCH.md`
- `state/action_status_index.json`
- `state/ACTION_STATUS_INDEX.md`
- delegation packets under `delegations/`.

---

## Desktop visual screentime integration concept

Connor's idea: install a laptop app that periodically captures desktop screenshots/photos and sends them back for processing into a database and context assets.

**Classification:** this is another Personal Orchestrator source integration. It belongs beside Gmail, Calendar, Open Tabs, audio memory, Signal Radar, etc. It is not itself an interaction architecture.

### Integration role

This is not “surveillance for screenshots.” It is a **visual ambient context feed** that expands the source layer.

It can answer:

- What apps/workspaces has Connor actually been in?
- What task was visually active when he switched context?
- Is he stuck in the same tool/error loop?
- Is there evidence of deep work, distraction, admin, coding, writing, meetings?
- What on-screen artifact should PO remember or ask about?

### Architecture

```text
macOS desktop agent
-> periodic screenshot / active window metadata / OCR
-> local redaction + compression
-> encrypted upload or local sync
-> raw visual heartbeat DB
-> OCR/object/context extractors
-> context assets
-> faculties: focus, execution, pattern, self-healing, knowledge-gaps
```

### Data model

Raw visual heartbeat table:

- `capture_id`
- `captured_at`
- `device_id`
- `active_app`
- `active_window_title`
- `display_id`
- `image_path` or blob ref
- `ocr_text`
- `redaction_status`
- `sensitivity_score`
- `activity_class`
- `source_hash`

Derived context assets:

- `visual_focus_block`
- `stuck_loop_signal`
- `tool_error_seen`
- `coding_session`
- `writing_session`
- `admin_session`
- `context_switch_burst`
- `open_artifact_reference`
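
The data model above could be prototyped in SQLite as follows. This is a sketch under stated assumptions: the column names come from the list above, but the table names, column types, the `context_assets` layout, and the trivial `extract_asset` rule are invented for illustration.

```python
import sqlite3

# Column names follow the plan; types and table names are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS visual_heartbeat (
    capture_id          TEXT PRIMARY KEY,
    captured_at         TEXT NOT NULL,
    device_id           TEXT NOT NULL,
    active_app          TEXT,
    active_window_title TEXT,
    display_id          TEXT,
    image_path          TEXT,
    ocr_text            TEXT,
    redaction_status    TEXT NOT NULL,
    sensitivity_score   REAL,
    activity_class      TEXT,
    source_hash         TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS context_assets (
    asset_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    asset_type TEXT NOT NULL,
    capture_id TEXT NOT NULL REFERENCES visual_heartbeat(capture_id),
    summary    TEXT
);
"""


def record_capture(conn: sqlite3.Connection, row: dict) -> None:
    """Insert one raw heartbeat row using named placeholders."""
    cols = ", ".join(row)
    params = ", ".join(f":{k}" for k in row)
    conn.execute(f"INSERT INTO visual_heartbeat ({cols}) VALUES ({params})", row)


def extract_asset(conn: sqlite3.Connection, capture_id: str) -> str:
    """Derive one context asset from one capture (toy classification rule)."""
    app, activity = conn.execute(
        "SELECT active_app, activity_class FROM visual_heartbeat WHERE capture_id = ?",
        (capture_id,),
    ).fetchone()
    asset_type = "coding_session" if activity == "coding" else "visual_focus_block"
    conn.execute(
        "INSERT INTO context_assets (asset_type, capture_id, summary) VALUES (?, ?, ?)",
        (asset_type, capture_id, f"{activity or 'unknown'} in {app or 'unknown app'}"),
    )
    return asset_type
```

Keeping raw captures and derived assets in separate tables makes it straightforward to apply a shorter retention policy to images than to the assets extracted from them.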

### Safety requirements

- local-first by default;
- pause/kill switch;
- app/window blocklist;
- password/secret/2FA redaction;
- private browsing/sensitive apps disabled by default;
- image retention policy separate from derived assets;
- user-visible indicator when capture is active;
- no screenshots sent to general LLM context unless explicitly allowed or redacted.
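
Several of these requirements can be enforced by a gate that runs before any capture and a redaction pass that runs before OCR text is stored. The sketch below is illustrative only: the blocklist entries, the regular expressions, and the function names are assumptions, and real secret detection would need to be far more thorough.

```python
import re

# Hypothetical blocklists; real entries would come from user configuration.
BLOCKLISTED_APPS = {"1Password", "Keychain Access"}
BLOCKLISTED_TITLE_PATTERNS = [
    re.compile(r"private browsing", re.IGNORECASE),
    re.compile(r"incognito", re.IGNORECASE),
]

# Toy patterns: "key: value" style secrets and 6-digit codes such as 2FA tokens.
KEYED_SECRET = re.compile(
    r"\b(password|passwd|secret|api[_-]?key|token)\b\s*[:=]\s*\S+",
    re.IGNORECASE,
)
SIX_DIGIT_CODE = re.compile(r"\b\d{6}\b")


def capture_allowed(paused: bool, active_app: str, window_title: str) -> bool:
    """Return False when the kill switch is on or the app/window is blocklisted."""
    if paused:
        return False
    if active_app in BLOCKLISTED_APPS:
        return False
    return not any(p.search(window_title) for p in BLOCKLISTED_TITLE_PATTERNS)


def redact(text: str) -> str:
    """Replace secret-looking values in OCR text with a placeholder."""
    text = KEYED_SECRET.sub(lambda m: f"{m.group(1)}: [REDACTED]", text)
    return SIX_DIGIT_CODE.sub("[REDACTED]", text)
```

Running `capture_allowed` before the screenshot is taken, rather than filtering afterwards, means blocklisted content never reaches disk.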

### Implementation candidates

- macOS ScreenCaptureKit app/daemon.
- Printing Press `agent-capture-pp-cli` for prototype captures, but it is a tool, not yet a resident app.
- Lightweight menu-bar app for permissions, pause/resume, source health, and sampling interval.

### Cadence

Initial safe default:

- capture metadata every 30–60s;
- screenshot every 3–5m while active;
- increase sampling temporarily when a local trigger fires, e.g. repeated error window, active coding session, or Connor asks “what was I doing?”;
- never capture during blocklisted apps/windows.
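
The cadence rules can be captured in one small policy function. In this sketch the baseline intervals sit inside the ranges given above (60s metadata, 5m screenshots), while the boosted intervals and the `Cadence` and `next_cadence` names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cadence:
    metadata_every_s: int
    screenshot_every_s: Optional[int]  # None means no screenshots


def next_cadence(active: bool, blocklisted: bool, trigger_fired: bool) -> Optional[Cadence]:
    """Pick sampling intervals; None means capture nothing at all."""
    if blocklisted:
        return None                      # never capture blocklisted apps/windows
    if not active:
        return Cadence(60, None)         # metadata only while idle
    if trigger_fired:
        return Cadence(30, 60)           # assumed temporary boost
    return Cadence(60, 300)              # baseline: 60s metadata, 5m screenshots
```

Checking the blocklist first guarantees that a local trigger can never raise sampling inside a blocklisted window.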

---

## Proposed new interface model

### Telegram becomes an interrupt lane, not the product

Telegram should receive:

- P0/P1 interrupts;
- daily/evening synthesis;
- approvals needed;
- completed background work;
- one high-leverage question.

Telegram should not be the only PO interface.

### Main Hermes chat becomes the consult lane

When Connor chats with Hermes, PO should be available as a consultable intelligence substrate.

Example commands/phrases:

- “consult PO”
- “ask my orchestrator”
- “what does my collaborator think?”
- “what am I missing across my sources?”
- “what should I focus on next?”
- “run faculties on this”

### Local artifacts become the observability lane

Everything important should have a proof path:

- latest run;
- latest consult;
- action/workbench;
- source health;
- visual heartbeat health;
- feedback effects.

### Background workbench becomes the autonomy lane

Create one place Connor can inspect:

```text
Watching
Working on
Needs approval
Done since last check
Suppressed / not bothering you about
Broken integrations
```
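
A renderer that turns workbench state into this board might look like the sketch below. The section titles come from the board above; the JSON keys, the `# PO Workbench` heading, and the `render_workbench` name are hypothetical, since the plan does not define the shape of `state/po_workbench.json`.

```python
from typing import Dict, List

# (assumed JSON key, section title from the board above)
SECTIONS = [
    ("watching", "Watching"),
    ("working_on", "Working on"),
    ("needs_approval", "Needs approval"),
    ("done", "Done since last check"),
    ("suppressed", "Suppressed / not bothering you about"),
    ("broken_integrations", "Broken integrations"),
]


def render_workbench(state: Dict[str, List[str]]) -> str:
    """Render workbench state as markdown, always showing every section."""
    lines = ["# PO Workbench", ""]
    for key, title in SECTIONS:
        lines.append(f"## {title}")
        items = state.get(key, [])
        if items:
            lines.extend(f"- {item}" for item in items)
        else:
            lines.append("- (none)")     # an empty section is still visible
        lines.append("")
    return "\n".join(lines)
```

Rendering empty sections explicitly lets Connor distinguish "nothing is broken" from "the board failed to report broken integrations".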

---

## Recommended next build slice

### Slice 1 — PO consult skill + runner

**Goal:** Make the orchestrator askable from normal Hermes chat.

Files:

- Create: `/root/.hermes/skills/personal-orchestrator/personal-orchestrator-consult/SKILL.md`
- Create: `/root/personal-orchestrator/runners/consult.py`
- Create: `/root/personal-orchestrator/tests/test_consult.py`
- Output: `/root/personal-orchestrator/consults/<timestamp>/CONSULT.md`
- Output: `/root/personal-orchestrator/consults/<timestamp>/consult.json`

Modes:

- `quick`: latest artifacts + context asset retrieval.
- `deep`: selected real faculties.
- `background`: queue full run and return tracking id.

Acceptance:

- Connor can ask a question in Telegram; Hermes loads the skill and answers from PO artifacts and context assets.
- Each consult writes an evidence artifact.
- If the answer needs deeper processing, the consult queues a background run instead of faking certainty.

### Slice 2 — Improve reactive loop delivery/routing

**Goal:** Make the 5m loop meaningful without spamming.

Files:

- Modify: `runners/reactive_dispatch.py`
- Modify: `/root/.hermes/scripts/personal_orchestrator_reactive.sh`
- Create: `tests/test_reactive_delivery_policy.py`

Add decisions:

- `local_safe_work`
- `delegate_background`
- `needs_approval`
- `interrupt_now`
- `batch_digest`

Acceptance:

- P0 can notify Telegram.
- P1 can batch or notify with cooldown.
- Local safe work can be queued/executed silently with artifacts.
- Suppressions are inspectable.
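
These acceptance criteria suggest a delivery policy with a cooldown and a suppression log. The sketch below is one possible reading, not the contents of `tests/test_reactive_delivery_policy.py`: the class name, the one-hour P1 cooldown, and the record format are assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


class DeliveryPolicy:
    """Decide whether a card notifies Telegram, is batched, or is suppressed."""

    def __init__(self, p1_cooldown: timedelta = timedelta(hours=1)) -> None:
        self.p1_cooldown = p1_cooldown            # assumed default cooldown
        self.last_p1_notify: Optional[datetime] = None
        self.suppressed: List[Dict[str, str]] = []  # inspectable suppression log

    def decide(self, priority: str, card_id: str, now: datetime) -> str:
        if priority == "P0":
            return "notify"                       # P0 always reaches Telegram
        if priority == "P1":
            cooled = (
                self.last_p1_notify is None
                or now - self.last_p1_notify >= self.p1_cooldown
            )
            if cooled:
                self.last_p1_notify = now
                return "notify"
            return "batch"                        # inside cooldown: hold for digest
        self.suppressed.append({
            "card_id": card_id,
            "priority": priority,
            "at": now.isoformat(),
            "reason": "below interrupt threshold",
        })
        return "suppress"
```

Recording a reason with every suppression is what makes the "Suppressed" section of the workbench explainable rather than a silent drop.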

### Slice 3 — PO Workbench

**Goal:** Make autonomy visible.

Files:

- Create: `runners/workbench.py`
- Output: `state/PO_WORKBENCH.md`
- Output: `state/po_workbench.json`

Sections:

- Watching
- Working on
- Needs Connor
- Done
- Suppressed
- Broken sources

Acceptance:

- Latest Telegram nudge links/summarizes Workbench state.
- Hermes consult skill can cite Workbench.

### Slice 4 — Desktop visual screentime integration spike

**Goal:** Validate whether screen capture can become a live source integration safely.

Files:

- Create: `spikes/desktop-visual-screentime/README.md`
- Create: `spikes/desktop-visual-screentime/schema.sql`
- Create: `spikes/desktop-visual-screentime/redaction_policy.md`
- Create: `spikes/desktop-visual-screentime/prototype_plan.md`

Prototype path:

1. macOS capture agent or `agent-capture-pp-cli` prototype.
2. OCR + active app/window metadata.
3. Local raw visual heartbeat DB.
4. Extractor into context assets.
5. Faculty packet injection.

Acceptance:

- No screenshots leave the local machine unless an explicit policy allows it.
- Redaction/blocklist exists before capture loop.
- One test capture creates one raw heartbeat row and one context asset.

---

## Open product questions

1. Should PO consult automatically on every substantive Hermes message, or only when the classifier sees PO-intent / personal operational relevance?
2. Should deep faculty consults run synchronously in chat or always background with status id?
3. What is the default interrupt threshold for Telegram: P0 only, P0/P1, or user-configurable quiet hours?
4. For desktop capture, should screenshots be retained in full, downsampled, reduced to OCR text after extraction, or kept only for a short retention window?
5. Should visual screentime be a separate source-health lane so Connor can see exactly what it captured and what it ignored?

---

## Definition of done for the redesign

The PO interface redesign is successful when Connor can experience all of these:

1. **Askable:** “what does my orchestrator think?” returns an evidence-backed answer.
2. **Proactive:** it finds work and moves local-safe pieces forward without needing explicit prompts.
3. **Omnipresent:** it continuously sees live source feeds, with desktop visual context treated as another audited integration when added.
4. **Attention-protected:** Telegram interrupts are rare, meaningful, and feedback-aware.
5. **Observable:** every answer, nudge, action, and suppression has an artifact path.
6. **Action-capable:** `do_this` or PO-initiated safe tasks enter a visible workbench/delegation lifecycle.
7. **Self-improving:** Connor feedback changes future surfacing, routing, and suppression.
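
The self-improving criterion implies that feedback must leave a measurable trace on later surfacing decisions. A minimal sketch of that idea follows; the four feedback labels come from the nudge (`useful/noisy/wrong/do_this`), while the weights, the per-topic scoring, and the threshold are assumptions chosen only to show the mechanism.

```python
from collections import defaultdict
from typing import DefaultDict

# Assumed weights; the plan does not specify how feedback is scored.
FEEDBACK_WEIGHTS = {"useful": 1.0, "do_this": 2.0, "noisy": -1.0, "wrong": -2.0}


class FeedbackLedger:
    """Accumulate feedback per topic and gate future surfacing on the score."""

    def __init__(self, suppress_at: float = -2.0) -> None:
        self.suppress_at = suppress_at
        self.scores: DefaultDict[str, float] = defaultdict(float)

    def record(self, topic: str, feedback: str) -> None:
        # Unknown labels raise KeyError rather than being silently ignored.
        self.scores[topic] += FEEDBACK_WEIGHTS[feedback]

    def should_surface(self, topic: str) -> bool:
        """Surface a topic unless its score has fallen to the threshold."""
        return self.scores.get(topic, 0.0) > self.suppress_at
```

With these assumed weights, two `noisy` responses on the same topic are enough to suppress it, and a later `useful` response brings it back, so each feedback effect is directly observable in the ledger.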