Shakeeb Debug Audit — Design Decisions

F3 · War Room placeholders · Medium

War Room HUD modules (priority heat, continuity banner, inbound callout, ambient telemetry) render illustrative copy in non-demo builds. Source comments at app/(atlas)/voice/page.tsx:28-32 and :57-65 mark them as Phase 1.6.6 work. Risk: during a real debug session, you could mistake the placeholder thought-stream for Shakeeb's actual reasoning. Pick how to make the staged content unambiguous.

Option A · Wire now

Connect every HUD module to a real endpoint this week.

Build out the priority-heat / continuity / inbound / telemetry feeds as Phase 1.6.6 originally planned. Removes placeholders by replacing them with truth.

Preview · live data

Priority heat (last 1h)

7 actionable

3 commits · 2 PR comments · 2 emails awaiting reply

Pros

No ambiguity — every HUD card is real.
Closes Phase 1.6.6 work proactively.
War Room becomes the actual mission view it claims to be.

Cons

Real work — 4 endpoints + their UI wiring (~1–2 days).
Each new feed brings its own latency / failure modes.
Premature if ambient telemetry isn't load-bearing yet.

Risk: medium · Effort: 1–2 days · Reversible: yes

My pick

Option B · Demote with pill

Mark every placeholder card with a "DEMO DATA" pill outside `?demo=1` mode.

Zero risk, ~30 min. Wraps the placeholder cards in a styled pill that visually demotes them. Real cards stay full-fidelity. The War Room shell stays true to the mockup, and debug sessions can't be misled by staged copy.
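The labelling rule can be sketched as a small pure function. This is a minimal sketch, assuming demo mode is read from the `demo` query parameter (as `?demo=1` suggests) and that each card carries a `placeholder` flag. `HudCard`, `isDemoMode`, and `needsDemoPill` are hypothetical names, not identifiers from the codebase.

```typescript
// Hypothetical card shape: `placeholder` marks cards not yet wired to a real feed.
interface HudCard {
  id: string;
  placeholder: boolean;
}

// Demo mode is on only when the query string carries demo=1.
function isDemoMode(search: string): boolean {
  return new URLSearchParams(search).get("demo") === "1";
}

// A card gets the "DEMO DATA" pill when it is a placeholder shown outside demo mode.
function needsDemoPill(card: HudCard, search: string): boolean {
  return card.placeholder && !isDemoMode(search);
}
```

Keeping the decision in one function would also give the single grep target that the pill is meant to provide when the placeholders are eventually removed.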

Preview · pill on placeholder cards

Demo data

Priority heat (last 1h)

7 actionable

3 commits · 2 PR comments · 2 emails awaiting reply

Pros

Ships in ~30 min · no new endpoints.
Visual hierarchy still matches mockup — placeholders just look quieter.
Pill carries one canonical signal — easy to grep + remove later.

Cons

Doesn't close the Phase 1.6.6 work — just labels it.
Adds visual noise to cards the user will eventually want clean.

Risk: none · Effort: ~30 min · Reversible: trivially

Option C · Hide outside demo

Don't render placeholder cards at all unless `?demo=1`.

Cards are present in demo mode for screenshot / video; absent in real builds. War Room loses a few modules until 1.6.6 wires them, but the layout shrinks gracefully.
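A minimal sketch of the hide-outside-demo rule and the grid math it implies. The `HudCard` shape, both function names, and the three-column maximum are assumptions made for illustration.

```typescript
// Hypothetical card shape: `placeholder` marks cards not yet wired to a real feed.
interface HudCard {
  id: string;
  placeholder: boolean;
}

// Cards to render: everything in demo mode, real cards only otherwise.
function visibleCards(cards: HudCard[], demoMode: boolean): HudCard[] {
  return demoMode ? cards : cards.filter((card) => !card.placeholder);
}

// Column count for the grid, so a shorter card list collapses instead of leaving holes.
// The default of 3 columns is an assumed layout, not taken from the mockup.
function gridColumns(cardCount: number, maxColumns: number = 3): number {
  return Math.max(1, Math.min(cardCount, maxColumns));
}
```

Deriving the column count from the filtered list is one way to let the layout shrink when modules are missing, but the real grid may need different handling.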

Preview · empty slot, layout collapses

— hidden in non-demo builds —

Pros

Cleanest result — no fake data anywhere.
Zero risk of debug-session confusion.

Cons

War Room shell stops matching the mockup — empty-feeling.
Layout has to handle missing cards gracefully (some grid math).
Forgetting `?demo=1` in screenshots → support team sees a stripped UI.

Risk: low · Effort: ~45 min · Reversible: yes

F4 · Mixed text+voice transcript model · Medium

Today the Presence rail merges text + voice turns in-memory, but persistence splits them: voice starts a fresh "Voice session — HH:MM" conversation row regardless of any active text conversation (useVoiceSession.ts:80-130). The result is one in-the-moment thread but two saved transcripts on /transcripts. Pick the persistence model that matches how you actually use the rail.

Current

Option A · Always split

Voice + text live in separate conversation rows. (status quo)

Every voice session creates its own Voice session — HH:MM row. Text conversations stay text. Rail merges them in-memory for the current session; transcripts page shows two threads.

Preview · /transcripts list

Email triage backlog · 12 turns · text · 09:42

Voice session — 09:48 · 3 turns · voice · 09:48

Voice session — 14:21 · 5 turns · voice · 14:21

Pros

Already shipped — zero work.
Clear semantic boundary: "this thread was voice."
Voice sessions are easy to filter / delete in bulk.

Cons

Rail-as-scratchpad mental model breaks at the persistence boundary.
Same conversation, two transcripts — finding "what we discussed" is split.
Voice-session titles (HH:MM) are uninformative.

Risk: n/a · Effort: 0 · Reversible: trivially

My pick

Option B · Always merge

Voice persistence inherits the rail's active text conversation when one exists.

If you've been typing in the rail and then hold Space, the voice turns append to the same conversation row. Each turn carries a channel badge (text / voice) so the transcript reader can show mode shifts. Cold-start voice (no active text) still creates a Voice session — HH:MM as today.
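The merge rule can be sketched as follows. This is a minimal sketch under stated assumptions: the `Conversation` and `Turn` shapes, the `newId` callback, and all function names are hypothetical. Only the channel values (text / voice) and the cold-start title format come from the description above.

```typescript
type Channel = "text" | "voice";

interface Conversation {
  id: string;
  title: string;
}

// Each persisted turn carries its channel so the transcript reader can badge it.
interface Turn {
  conversationId: string;
  channel: Channel;
  body: string;
}

function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

// Title for a cold-start voice row, in local time, e.g. "Voice session — 09:48".
function voiceSessionTitle(startedAt: Date): string {
  return `Voice session — ${pad2(startedAt.getHours())}:${pad2(startedAt.getMinutes())}`;
}

// Option B rule: reuse the rail's active text conversation when one exists,
// otherwise create a fresh voice-session row as today.
function resolveVoiceConversation(
  active: Conversation | null,
  startedAt: Date,
  newId: () => string,
): Conversation {
  if (active !== null) return active;
  return { id: newId(), title: voiceSessionTitle(startedAt) };
}

// Build a voice turn attached to whichever conversation was resolved.
function voiceTurn(conversation: Conversation, body: string): Turn {
  return { conversationId: conversation.id, channel: "voice", body };
}
```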

Preview · unified transcript with channel badges

You

Text · What's left on the email triage queue?

Shakeeb

Text · Four threads need a reply. Two from Daisy, one from Stripe…

— held Space at 09:48 —

You

Voice · Draft the Daisy ones, leave Stripe for me.

Shakeeb

Voice · Two drafts ready in your Drafts folder.

Pros

Mental model matches how the rail feels in the moment.
Searching "Daisy emails" finds the whole thread, not half of it.
Channel badges keep the audit trail honest.

Cons

Schema work: atlas_messages.channel column + UI badge.
Conversation titles can drift if voice goes off-topic mid-thread.
Voice-only filter on /transcripts becomes "filter by channel," not by row.

Risk: low · Effort: ~2–3 hours, 1 PG migration · Reversible: yes

Option C · Time-windowed merge

Merge if the last text turn was within 10 minutes; otherwise split.

Keeps a "session" feeling without forcing every voice burst to inherit a stale text conversation. Same channel-badge mechanic as B; the difference is the join condition.
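The join condition can be sketched as a single predicate. This is a minimal sketch, assuming the last text turn's timestamp is available and that "within 10 minutes" is inclusive. `shouldMerge` and `JOIN_WINDOW_MINUTES` are hypothetical names.

```typescript
// The 10-minute window from the option description; the tuning knob lives here.
const JOIN_WINDOW_MINUTES = 10;

// Returns true when a voice burst should append to the active text conversation.
// `lastTextAt` is null when there is no prior text turn (cold-start voice).
function shouldMerge(
  lastTextAt: Date | null,
  voiceAt: Date,
  windowMinutes: number = JOIN_WINDOW_MINUTES,
): boolean {
  if (lastTextAt === null) return false;
  const gapMinutes = (voiceAt.getTime() - lastTextAt.getTime()) / 60000;
  // A negative gap means clock skew or out-of-order events; treat it as a split.
  return gapMinutes >= 0 && gapMinutes <= windowMinutes;
}
```

With the preview's numbers, a last text turn at 09:42 and voice at 09:48 gives a 6-minute gap and merges, while 09:48 to 14:21 gives 273 minutes and splits. Comparing stored timestamps rather than running a timer may sidestep some of the device-sleep and background-tab cases, though that is untested here.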

Preview · join window = 10 min

Email triage backlog · last text 09:42 · voice at 09:48 → MERGED (6 min)

Voice session — 14:21 · last text 09:48 · voice at 14:21 → SPLIT (273 min)

Pros

Matches "topic-arc" feel without manual conversation switching.
Stops a 3pm voice burst from inheriting a 9am text title.

Cons

"Why did this voice turn merge but not the next one?" is hard to explain.
Window threshold is a hidden magic number — tuning is forever.
Edge cases: device sleep, background tab, etc. throw off the timer.

Risk: medium · Effort: ~4 hours · Reversible: yes

F5 · Inert-write confirmation policy · Low / Medium

Today text chat + voice can fire create_task, draft_reply, extract_commitments, propose_*, sync_google_tasks_now without a confirmation modal — they're classified as inert writes (no external send). External Calendar writes already require an approval token. Pick the policy for the inert middle. Quote from the audit: "User says 'I should probably call Sarah sometime' and the model over-eagerly creates a task."
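For reference, the inert-write classification can be sketched as a lookup. This is a minimal sketch that uses only the tool names listed above and treats `propose_*` as a prefix match. The function name `isInertWrite` is hypothetical, and the real classifier may live elsewhere or work differently.

```typescript
// Tool names taken from the list above; propose_* is handled as a prefix.
const INERT_TOOLS = new Set<string>([
  "create_task",
  "draft_reply",
  "extract_commitments",
  "sync_google_tasks_now",
]);

// True when a tool mutates internal state only (no external send).
function isInertWrite(tool: string): boolean {
  return INERT_TOOLS.has(tool) || tool.startsWith("propose_");
}
```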

Current

Option A · Status quo

Inert writes fire immediately, no UI confirmation.

Trust the agent on internal mutations. External writes (Calendar create / update / delete) stay token-gated as today. Drafts, tasks, commitments, propose-only writes happen without a modal.

Preview · no confirmation surface

You

I should probably call Sarah sometime.

Shakeeb

Created task: Call Sarah. ✓

Pros

Fastest path — agent feels useful, not paranoid.
External writes already have the strong gate.
Zero work.

Cons

Off-handed remarks become tasks / drafts you have to clean up.
No visible "I just did X" surface — easy to lose track of what changed.
Trust contract feels asymmetric (Calendar gated; tasks not).

Risk: ongoing surprise · Effort: 0 · Reversible: n/a

My pick

Option B · Undo toast

Fire immediately, then surface an 8-second undo banner.

The inert write happens; a toast appears bottom-right with the action label and an Undo button. Clicking Undo within 8s rolls the write back. After 8s the toast dismisses. This keeps the trusted-fast path, gives a visible "what just happened" surface, and fixes the ghost-task surprise without a modal.
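The undo window can be sketched independently of the toast UI. This is a minimal sketch, assuming each inert tool supplies its own rollback callback. `UndoQueue`, `UndoEntry`, and the injectable clock are hypothetical, and the sketch omits the queueing and rate-limiting that stacked toasts would need.

```typescript
// The 8-second window from the option description.
const UNDO_WINDOW_MS = 8000;

interface UndoEntry {
  label: string;        // text shown in the toast
  undo: () => void;     // tool-specific rollback (delete task, discard draft, ...)
  expiresAt: number;    // epoch milliseconds
}

class UndoQueue {
  private entries = new Map<number, UndoEntry>();
  private nextId = 1;
  private now: () => number;

  // The clock is injectable so the window can be tested without real timers.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Register an inert write that has already happened; returns a toast id.
  push(label: string, undo: () => void): number {
    const id = this.nextId++;
    this.entries.set(id, { label, undo, expiresAt: this.now() + UNDO_WINDOW_MS });
    return id;
  }

  // Runs the rollback if the window is still open; returns whether it ran.
  undo(id: number): boolean {
    const entry = this.entries.get(id);
    if (entry === undefined) return false;
    this.entries.delete(id);
    if (this.now() > entry.expiresAt) return false;
    entry.undo();
    return true;
  }
}
```

A tool would call `push('Created task "Call Sarah"', () => deleteTask(id))` right after the write, where `deleteTask` stands in for whatever rollback that tool actually has.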

Preview · undo surface, 8s timer

✓

Created task "Call Sarah" · no due date · personal list

7s

Undo

Pros

Doesn't slow down the trusted path.
Visible audit trail for every inert mutation.
One toast component, reusable across all inert tools.
"Undo" is the right primitive for reversible writes — modal is overkill.

Cons

Toasts in fast bursts can stack — needs queue + rate-limit.
Each tool needs an undo handler (delete the task / discard the draft / drop the commitment).
If you miss the 8s window you're back to manual cleanup.

Risk: low · Effort: ~3–4 hours, 1 toast component + N undo handlers · Reversible: yes

Option C · Per-tool toggle

Settings toggles per inert tool — defaults on, opt-out for sensitive ones.

Adds auto_create_tasks, auto_draft_replies, and auto_extract_commitments toggles to Settings. When off, the tool requires a UI confirmation. When on (default), it fires immediately. Power users get full control, but the ghost-task fix requires the user to know to flip the toggle.
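The dispatcher gate can be sketched as a lookup from tool to toggle. This is a minimal sketch: the three setting keys come from the description above, while the tool-to-toggle mapping, the `gate` function, and the choice to leave untoggled tools on the status-quo path are assumptions.

```typescript
type ToggleKey =
  | "auto_create_tasks"
  | "auto_draft_replies"
  | "auto_extract_commitments";

type AgentSettings = Record<ToggleKey, boolean>;

// Defaults on, per the option description.
const DEFAULT_SETTINGS: AgentSettings = {
  auto_create_tasks: true,
  auto_draft_replies: true,
  auto_extract_commitments: true,
};

// Assumed mapping from tool name to the toggle that governs it.
const TOGGLE_FOR_TOOL = new Map<string, ToggleKey>([
  ["create_task", "auto_create_tasks"],
  ["draft_reply", "auto_draft_replies"],
  ["extract_commitments", "auto_extract_commitments"],
]);

type Decision = "fire" | "confirm";

// Decide whether the dispatcher runs the tool immediately or asks first.
function gate(tool: string, settings: AgentSettings): Decision {
  const key = TOGGLE_FOR_TOOL.get(tool);
  if (key === undefined) return "fire"; // no toggle defined: status quo (assumption)
  return settings[key] ? "fire" : "confirm";
}
```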

Preview · Settings → Agent behavior

Auto-create tasks: When off, Shakeeb asks before creating tasks from chat.

Auto-draft replies: When off, Shakeeb asks before drafting email replies.

Auto-extract commitments: When off, Shakeeb asks before logging commitments from conversations.

Pros

User has full control over agent behavior.
Sensitive tools (commitments) can ship default-off.
Survives across sessions — set once, done.

Cons

Settings page bloats with binary toggles.
"I forgot which tools are on" — opaque, unlike a toast that shows up live.
Default-on means the surprise still happens until you discover the toggle.

Risk: low · Effort: ~2 hours, settings rows + tool dispatcher gates · Reversible: yes
