Open · Needs decisions

Shakeeb Evolution — three decisions before code

Phase 1.7 adds the reflection engine (option 1: nightly belief formation) and the self-critique loop (option 2: draft → critique → revise on every Shakeeb response). Persistent personalization (option 3) is scheduled as a follow-on once reflection produces 2+ weeks of beliefs.

Three decisions below need your call before I scope the implementation. Each section lays out 4–5 options with mechanics, a visual preview, and pros/cons; my pick is marked.

2026-04-27 · audits/design-options-shakeeb-evolution-2026-04-27.html

Decision 1

Reflection cadence — when does the engine run?

Your direction: presence-aware, range-based, smart — not a dumb fixed-clock cron. The engine pulls 24h of conversations + approvals + drafts, distills them into beliefs, proposes memory consolidations, and writes a daily summary. The question is what triggers the run.

Below is the full design space, from the dumbest baseline to the most ambitious. Each option's visual preview shows a sample week — gray cells = you're logged in, dots = reflection ran.

Option A
Fixed nightly cron — 11pm local
Baseline. Runs every day at 11pm regardless of presence. Simplest possible engine.
[Sample-week preview (Mon–Sun) · legend: logged in that day / reflection fired]

Pros

  • 2 days to ship; one cron, one service.
  • Fully predictable — you always know when it ran.

Cons

  • Reflects on empty days (waste of tokens).
  • Fires while you're mid-conversation if you're a night owl.
  • Not "smart" — your stated requirement.
Build: 2 days · Cost: $1/day fixed · Risk: low
Option B
Presence window + idle detection
Defines a window (e.g., 10pm–2am). Triggers when you're logged in inside the window AND idle 5+ minutes. Your initial proposal, refined.
[Sample-week preview (Mon–Sun) · legend: logged in inside window / reflection fired (idle) / skipped (no login in window)]

Pros

  • Respects presence — no firing while you're away.
  • Idle gate prevents interrupting active work.
  • Maps exactly to your stated direction.

Cons

  • Window is still arbitrary — what about night-owl sessions past 2am?
  • Skip days accumulate forever (no fallback).
  • Doesn't scale fire frequency to actual activity volume.
Build: 4 days · Cost: $0–1/day · Risk: low
Option C
Activity-volume adaptive — no clock
No fixed time. Tracks "reflection debt" — accumulating activity score (emails triaged, conversations had, commitments made). Crosses threshold + you're idle → reflect.
[Sample-week preview (Mon–Sun) · legend: logged in / debt threshold crossed → fire / below threshold → skip]

Pros

  • Scales fire frequency to actual signal volume.
  • Quiet days never burn tokens.
  • Heavy days might trigger same evening.

Cons

  • Unpredictable — you don't know when reflection happened.
  • Threshold tuning is hard; needs telemetry to calibrate.
  • Slow-accumulation periods leave beliefs stale.
Build: 5–6 days · Cost: variable, $0–2/day · Risk: medium
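Option C's debt mechanic fits in a few lines. The weights and threshold below are illustrative stand-ins (the card itself says calibration needs telemetry), and every name is hypothetical:

```python
# Illustrative weights and threshold for Option C's "reflection debt".
# Real values would be calibrated from telemetry; these are guesses.
WEIGHTS = {"email_triaged": 1, "conversation": 3, "commitment": 5}
THRESHOLD = 20

class ReflectionDebt:
    def __init__(self) -> None:
        self.score = 0

    def record(self, event: str) -> None:
        # Each activity event adds its weight; unknown events add nothing.
        self.score += WEIGHTS.get(event, 0)

    def should_fire(self, idle: bool) -> bool:
        # Crossing the threshold alone isn't enough; the user must be idle.
        return idle and self.score >= THRESHOLD

    def reset(self) -> None:
        # Called after a reflection run completes.
        self.score = 0
```

The idle gate mirrors the other options: crossing the threshold queues a reflection, but it only fires once you step away.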
My Pick
Option D
Hybrid — learned window + activity floor + skip cap
Combines presence-awareness, learned windows, activity-aware scaling, and a hard cap on skip days. The smart answer, with explicit knobs.
[Sample-week preview (Mon–Sun) · legend: logged in window / full reflection / forced (skip-cap) / below activity floor / not logged in]
Mechanics
  1. Learned window: first 14 days, observes when you typically wind down (logout time, idle-then-resume gaps). Settles on a per-user range, e.g. 10:30pm–1:30am. User can override.
  2. Trigger conditions (all must be true):
    • Inside the learned window.
    • Idle 3+ minutes (no input, no agent activity).
    • Activity floor met: ≥ 3 conversations OR ≥ 8 emails triaged OR ≥ 1 commitment recorded today.
  3. Skip cap: if 3 days pass without a run (whether quiet or away), the next login forces a multi-day reflection regardless of window.
  4. Manual trigger: always available via Cmd+K → "Reflect on today now". Immediate, ignores all gates.
  5. Cost ceiling: hard $1.50/day cap on reflection tokens (tied to existing $20/day master cap).
  6. Audit: every fire writes to atlas_reflections with trigger reason (window+idle+floor / forced-after-skip / manual).
Window: learned, default 10:30pm–1:30am · Idle threshold: 180s · Activity floor: 3 conv ∨ 8 email ∨ 1 commit · Skip cap: 3 days
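The trigger conditions above reduce to a single predicate. A minimal sketch using the knob values from the card; the function and field names are my own, not the real service's:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Activity:
    conversations: int = 0
    emails_triaged: int = 0
    commitments: int = 0

# Knob values from the card above; names are illustrative.
IDLE_THRESHOLD_S = 180   # "idle 3+ minutes"
SKIP_CAP_DAYS = 3

def activity_floor_met(a: Activity) -> bool:
    # The floor is a disjunction: any one signal is enough.
    return a.conversations >= 3 or a.emails_triaged >= 8 or a.commitments >= 1

def should_reflect(in_window: bool, idle_s: int, today: Activity,
                   days_since_last_run: int) -> Optional[str]:
    """Return the trigger reason for the audit row, or None (no fire)."""
    if days_since_last_run >= SKIP_CAP_DAYS:
        return "forced-after-skip"       # skip cap ignores window and floor
    if in_window and idle_s >= IDLE_THRESHOLD_S and activity_floor_met(today):
        return "window+idle+floor"
    return None
```

The manual Cmd+K path would bypass this gate entirely and log its own trigger reason.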

Pros

  • Smart in your sense — adapts to your actual patterns, not a guess.
  • Activity floor stops empty-day waste.
  • Skip cap stops belief drift on quiet weeks.
  • Manual trigger gives you a release valve.
  • All knobs are explicit — no black-box behavior.

Cons

  • Most complex of the options — 4 trigger inputs to test.
  • Learned window needs a 14-day cold-start period.
  • If knobs are wrong, behavior feels random.
Build: 7–8 days · Cost: $0.50–$1.50/day · Risk: medium (knob tuning)
Option E
Continuous micro-reflection + weekly deep
No batched cadence at all. After every conversation closes, a tiny background pass updates beliefs incrementally. Weekly "deep" reflection consolidates the micro-updates.
[Sample-week preview (Mon–Sun) · legend: micro-reflection (per conversation) / weekly deep consolidation]

Pros

  • Beliefs always fresh — real-time feel.
  • Closest match to how human memory actually works.
  • No cold-start period.

Cons

  • ~10× cost of the batched options ($5–10/day).
  • Approval queue grows after every conversation — overwhelming.
  • Hard to debug — beliefs change between sessions for unclear reasons.
Build: 12+ days · Cost: $5–10/day · Risk: high (cost + approval flood)
Decision 2 · advise me

Auto-approve threshold — what mutates without your sign-off?

The reflection engine produces three classes of mutation: memory consolidations (merge duplicates, retag), memory prunes (delete), and opinion updates (beliefs about you, your preferences, your stances). The draft-first contract is non-negotiable — but applying it to every single edit creates an approval queue that piles up until you stop reading it.

My recommendation: Option C — Tiered by mutation type. It matches risk to friction, doesn't depend on shaky LLM self-confidence scores, and mirrors your existing asymmetric trust pattern (Calendar PATCH/DELETE got the same treatment in Patch 44).

Option A
Strict draft-first on every edit
Every consolidation, prune, opinion change goes to the queue. Zero auto-apply.
  • Pending: Merge 4 duplicate emails about Q2 budget review (consolidation · 2 min ago) [Approve / Reject]
  • Pending: Retag 12 memories from work/general → work/founders (consolidation · 5 min ago) [Approve / Reject]
  • Pending: Update opinion: prefers terse replies on internal threads (opinion · 8 min ago) [Approve / Reject]
  • Pending: Delete 3 stale entities (no references in 90+ days) (prune · 12 min ago) [Approve / Reject]

Pros

  • Zero trust violation surface.
  • You always know exactly what changed.

Cons

  • Queue piles up fast (10–30 items/day).
  • You'll start ignoring it after week 2.
  • Friction kills the engine's value.
Friction: high · Trust risk: zero
Option B
Confidence-graded auto-approve
Each proposal gets a self-assessed confidence score (0–1). Above 0.85 = auto. Below = queue.
  • Auto · 0.94: Merge 4 duplicate emails (consolidation · auto-applied) [Audit log →]
  • Auto · 0.91: Retag 12 memories (consolidation · auto-applied) [Audit log →]
  • Pending · 0.72: Update opinion: prefers terse replies (opinion · pending review) [Approve / Reject]

Pros

  • Queue stays small — only the borderline cases.
  • Trust grows as the model gets calibrated.

Cons

  • LLM self-confidence is famously unreliable — false confidence on wrong edits.
  • No structural protection on sensitive mutations (opinions about you).
  • Calibration takes weeks; early period is risky.
Friction: medium · Trust risk: medium-high
My Pick
Option C
Tiered by mutation type
Auto-apply low-risk ops with audit. Always queue high-risk ops. No model self-assessment needed — rules are structural.
  • Auto: Merge 4 duplicate emails about Q2 budget review (consolidation · auto · audit logged) [Revoke →]
  • Auto: Retag 12 memories from work/general → work/founders (tagging · auto · audit logged) [Revoke →]
  • Queue: Update opinion: prefers terse replies on internal threads (opinion edit · always queued) [Approve / Reject]
  • Queue: Delete 3 stale entities (no refs 90+ days) (prune · always queued) [Approve / Reject]
Tiers
  1. Auto-apply (with audit + revoke): deduplication, retagging, embedding refresh, fact-extraction-from-conversation. All reversible.
  2. Always queue: opinion creation/update (beliefs about you), memory prune/delete, entity merge across people, anything touching atlas_opinions or atlas_entities identity columns.
  3. Audit log: Memory Browser gets a "Recent auto-edits" tab with one-click revoke (reverses the op + adds a "do not auto" rule for the pattern).
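As a sketch, the routing is a pure function of mutation type plus one structural flag, with no model self-assessment anywhere. The type names are my assumptions; the auto/queue split and the identity-column rule follow the tiers above:

```python
# Structural tier routing (sketch). Mutation-type names are illustrative;
# the identity-column rule refers to atlas_opinions / atlas_entities from the doc.
AUTO_APPLY = {"dedup", "retag", "embedding_refresh", "fact_extraction"}
ALWAYS_QUEUE = {"opinion_create", "opinion_update", "prune", "entity_merge"}

def route(mutation_type: str, touches_identity_columns: bool = False) -> str:
    if touches_identity_columns:
        return "queue"        # identity columns always need sign-off
    if mutation_type in ALWAYS_QUEUE:
        return "queue"
    if mutation_type in AUTO_APPLY:
        return "auto"         # applied immediately, audit-logged, revocable
    return "queue"            # unknown types default to the safe path
```

Defaulting unknown types to the queue is the conservative choice: a new op class gets human review until it's explicitly added to the auto tier, which also handles the edge case of a consolidation that touches an opinion.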

Pros

  • Predictable rules — no surprise behavior.
  • Matches your existing asymmetric trust pattern (Calendar Patch 44).
  • Queue stays small (only opinions + prunes), but high-signal items.
  • Revoke gives you safety net on auto-applied ops.
  • Doesn't depend on LLM self-confidence.

Cons

  • Need to define and maintain the tier list.
  • Edge cases (e.g., consolidation that touches an opinion) need explicit routing.
Friction: low · Trust risk: low · Build delta: +1 day vs A
Option D
Trust ramp — strict → user-promoted
Week 1–2: strict draft-first on everything. Week 3+: you can promote categories to auto based on rejection rate.
  • Auto · promoted: Consolidations (you've approved 47/47 in this category) · promoted to auto on day 17 [Demote →]
  • Queue · strict: Opinion edits (still in observation period) · 12/14 approved · 2 more to promote [Approve / Reject]

Pros

  • Progressive trust — you decide what's safe.
  • Behavior is data-driven from your actual approvals.

Cons

  • Behavior changes over time → confusing.
  • Heavy first 2 weeks of friction kills initial adoption.
  • Promotion logic is one more system to build + debug.
Friction: high → low (over time) · Trust risk: low
Decision 3

Critique trail visibility — you were right, this is a transcript question

Self-critique loop: Shakeeb generates a draft → critiques it against a checklist (factual? tone-matched? respects opinions? draft-first compliant?) → revises → returns. The intermediate draft and critique reasoning are artifacts — the question is whether to persist them and where.
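The loop's shape, sketched with stand-in callables for the model calls; the checklist and the pass cap come from this section, everything else is illustrative:

```python
from typing import Callable, List, Tuple

# Checklist the critique pass scores each draft against (from this section).
CHECKLIST = ["factual", "tone-matched", "respects-opinions", "draft-first"]
MAX_PASSES = 3   # illustrative cap on revision rounds

def critique_loop(prompt: str,
                  generate: Callable[[str], str],
                  critique: Callable[[str], List[str]],
                  revise: Callable[[str, List[str]], str]) -> Tuple[str, list]:
    """Return the final answer plus the trail of intermediate artifacts."""
    draft = generate(prompt)
    trail = []
    for n in range(1, MAX_PASSES + 1):
        flags = critique(draft)          # failed checklist items; empty = clean
        trail.append({"pass": n, "draft": draft, "flags": flags})
        if not flags:
            break
        draft = revise(draft, flags)
    return draft, trail
```

Whichever option wins below only changes what happens to `trail`; the loop itself is identical in all four.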

Your pushback was correct: "doesn't this just go in transcript?" — yes. The four options below are all variations on how it surfaces in (or near) the transcript. None is "build a separate timeline UI."
Option A
Transient — final answer only
The critique runs, improves the answer, then gets discarded. Nothing in the transcript except the final reply.
You: draft a reply to Sarah about Thursday's call
Shakeeb: Drafted in your Inbox — proposes Thursday 2pm, references the budget thread, signs off in your usual style. Ready when you are.

Pros

  • Cleanest UI — no extra affordances.
  • Zero storage cost.
  • Critique stays a pure quality-improvement step.

Cons

  • If a response surprises you, no way to see why Shakeeb said what he said.
  • Loses signal that could improve the system.
UI: none · Storage: none
My Pick
Option B
Inline collapsed beneath the message
Like Claude's extended thinking. Final answer is what you see; one click reveals the critique trail. Default: collapsed.
You: draft a reply to Sarah about Thursday's call
Shakeeb: Drafted in your Inbox — proposes Thursday 2pm, references the budget thread, signs off in your usual style. Ready when you are.
[Show reasoning trail · 3 critique passes]
  Pass 1 draft: proposed Thursday morning. Self-critique flagged: conflicts with Atlas standup at 10am — check calendar.
  Pass 2 draft: proposed Thursday 2pm. Self-critique flagged: tone too formal for Sarah — past 14 emails average 22 words, current draft is 38.
  Final: tightened to 19 words, kept the 2pm proposal.

Pros

  • Lives where the user already looks — the chat itself.
  • Hidden by default → no clutter for normal use.
  • Available when something feels off → instant explanation.
  • Familiar UX (Claude thinking, ChatGPT reasoning).

Cons

  • Storage grows with every Shakeeb response (~1KB each).
  • Critique trail is an artifact users can argue with — needs care in copy.
UI: 1 collapsible block · Storage: ~1KB / response · Build: +0.5 day
Option C
Settings toggle — off by default
Same as B, but the "Show reasoning" affordance only appears if you've turned on "Show critique trails" in Settings.
You: draft a reply to Sarah
Shakeeb: Drafted in your Inbox — proposes Thursday 2pm, references the budget thread, signs off in your usual style.
Settings → Conversation → ☐ Show critique trails (off by default)

Pros

  • Cleanest default UI for casual use.
  • Power-user opt-in available.

Cons

  • Most users never find the toggle.
  • Trail is invisible exactly when you'd want it (the "huh, why did Shakeeb say that?" moment).
UI: toggle + collapsible · Discoverability: low
Option D
Separate audit log — debug menu only
Critique trails persisted to a debug audit panel. Never inline. For diagnosing model behavior, not casual use.
Debug → Critique Audit
  • Conversation #4421 · message 12 · 3 critique passes · final delta: +12% conciseness, +1 calendar conflict caught
  • Conversation #4420 · message 8 · 1 critique pass · final delta: tone match improved (formal → casual)

Pros

  • Zero impact on normal chat UI.
  • Useful for me when debugging Shakeeb's behavior.

Cons

  • You'll never look at the debug panel.
  • Wastes the trail's main user value (in-context explanation).
  • Fails your own test: "doesn't this just go in transcript?" Here it doesn't.
UI: separate panel · Discoverability: ~zero

Your picks

Reply with the letter for each decision. My recommendations: cadence-D, autoapprove-C, critique-B.

If you agree with all three, just say "agree with all" and I'll start scoping the implementation (ADR for cadence, contracts + tests for the reflection service, agent-runtime patch for the critique loop).

If you want to override any, reply with e.g. cadence-D, autoapprove-C, critique-A and I'll execute that combo.

cadence-_, autoapprove-_, critique-_