In Progress · Created 2026-06-17 · 0/33 tasks

Telegram Digest Loop-Closer + Tier 1 Gate Relax

Priority: P2 · Category: PREP


Executive Summary

The AI intelligence agent has a working preview stage (Mon/Thu digests), but the implement and eval loops are dormant: tier1_apply.py has applied 0 finds in 4 days, and feedback.jsonl has 1 entry total. This plan wires the Telegram bot to be the loop-closer: reaction buttons on each digest finding write verdicts to feedback.jsonl (closes eval), and the Tier 1 source gate is widened so Cole-curated X threads can also auto-apply (closes implement). Together these convert the intel agent from "fancy reading material" into a learning system that produces measurable doc/skill changes per week.


Research Phase

Current State

Pipeline (~/ai-projects-local/mission-control/scripts/ai_intelligence/):

  • scrape_*.py runs 2×/day via LaunchAgent → drops candidates in data/ai-intelligence/incoming/
  • score.py → canonicalizes, scores (Haiku), merges to finds.jsonl (currently 110 finds)
  • tier1_apply.py runs after each scrape, hard-gated on is_official(source) AND score >= 80
  • digest.py runs Mon/Thu → assembles digest, includes any X threads from browser-agent/threads/pending/, sends via telegram.py
  • feedback.py accepts verdicts and can dispatch a find to MC daemon queue

Telegram bot (~/ai-projects-local/mission-control/scripts/telegram_watcher.py):

  • Long-polling worker, persistent state in data/telegram_watcher_state.json
  • Permission tiers via daemon/users.py (admin/operator)
  • Already saves X-thread URLs to browser-agent/threads/pending/ via Playwright
  • Has zero callback_query / inline keyboard support — entirely text-message based today

Loop status:

  • ✅ Preview firing (last digest 2026-06-15, scrape this morning 06:44)
  • ❌ Tier 1 auto-apply: last run logged tier1: no eligible finds; tier1_log.jsonl does not exist
  • ❌ Eval: feedback.jsonl has 1 row (2026-06-12) across 4 days of operation
  • ⚠️ Tier 3 dispatch: works in code, but MC approval queue is 49 deep and daemon idle

Historical Context

  • Old claude_intelligence.sh (daily 8am) did Tier 1 auto-apply directly; the new flow split this out into tier1_apply.py with a stricter gate.
  • The Telegram bot was originally built to chat with Claude; the X-thread queue was bolted on in May 2026 (per feedback_telegram_xthread_routing.md).
  • Cole's stated frustration (this conversation): "are we actually implementing these things, or are we just reading through the thread and calling it good?"

Constraints

  • Bot is single-process long-poll. Adding callback handlers must not break message flow (~30s typing-indicator loop must keep working).
  • Permission tier. Only admin (Cole) should be able to dispatch / approve via Telegram. Operator (Caleb) can react, but his verdicts are logged separately.
  • No new MCP servers. urllib HTTP calls to the Telegram API match the existing style.
  • feedback.jsonl schema is fixed by feedback.py (ts, find_id, verdict, title, score, lens). Keep compatibility.
  • Tier 1 gate change is reversible — must include kill switch / rollback to old gate.
  • Telegram limits: message text is capped at 4096 characters, and each inline button's callback_data at 64 bytes.

Options Considered

Option 1: Polling-only digest cards (chosen for #1)

Approach: digest.py sends one Telegram message per top-K find (K=5-10) with inline keyboard buttons (Done / Dispatch / Save / Dismiss). telegram_watcher.py adds a callback_query branch in its long-poll loop that writes verdicts to feedback.jsonl and answers the callback with a confirmation toast.

Pros:

  • Reuses existing bot process (no new daemon)
  • Inline keyboards are first-class Telegram primitive — no webhook required
  • Each finding becomes a discrete decision the user can knock out from phone
  • Symmetric with how MC dashboard already shows finds

Cons:

  • Adds callback_query handling to a code path that has so far only handled text messages
  • 5-10 messages per digest is moderate notification noise (mitigation: send as one digest message with a "Review →" button that lands on the card sequence)

Estimated Effort: Medium (3-4 hours)
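
A minimal sketch of how the long-poll loop might branch on update type, assuming the existing loop iterates over the result list returned by getUpdates. route_update and next_offset are hypothetical names for illustration, not functions that exist in telegram_watcher.py today. One thing to check during the build: getUpdates delivers callback_query updates by default, but if the current call passes allowed_updates, callback_query would need to be in that list.

```python
def route_update(update: dict) -> str:
    """Classify one Telegram update so the loop can pick a handler."""
    if "callback_query" in update:
        return "callback_query"  # inline-keyboard button tap
    if "message" in update:
        return "message"  # existing text-message flow, unchanged
    return "other"  # edited messages, channel posts, etc. are ignored


def next_offset(updates: list[dict], current: int) -> int:
    """Offset for the next getUpdates call: highest update_id seen, plus one."""
    if not updates:
        return current
    return max(u["update_id"] for u in updates) + 1
```

Keeping the routing in a small pure function means the text-message branch stays exactly as it is, and the callback branch can be tested without a live bot.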

Option 2: Single threaded digest message + reply-with-command verdicts

Approach: Digest sends one summary message. Cole replies done F-0617-3, dispatch F-0617-5, etc. Bot parses replies as text.

Pros: No callback_query plumbing; pure text. Fastest to build.

Cons: Friction. Typing find IDs on phone is exactly the kind of micro-toil that kills loop closure. The whole point is one-tap verdicts. Defeats the goal.

Estimated Effort: Low (1-2 hours)

Option 3: Web dashboard mobile review only

Approach: Don't touch Telegram. Add a /digest/review page to MC dashboard, send a Telegram link to it on Mon/Thu.

Pros: Richer UI, more buttons possible.

Cons: Opens browser, requires SSO/VPN to dashboard from phone (Cole has noted login friction). Defeats the "while watching TV" goal. This is the same friction that keeps the MC queue at 49.

Estimated Effort: Medium (4-5 hours, more if auth needed)


Tier 1 gate options (#2)

Option A: Add cole_curated source class (chosen)

Approach: Anything in browser-agent/threads/pending/ (i.e. Cole manually queued via Telegram) gets tagged source_class: "cole_curated" in score.py. tier1_apply.py gate becomes (is_official(source) OR source_class == "cole_curated") AND score >= 75.

Pros:

  • Adds a provenance tier that reflects reality (Cole already vetted these by queueing)
  • Lower threshold (75 vs 80) reflects that human curation reduces false positives
  • Easy to revert (one boolean flip in gate)

Cons:

  • Expands what auto-edits CLAUDE.md/howto.md — slightly larger blast radius
  • Requires schema bump on finds (source_class field)

Estimated Effort: Low (1 hour)

Option B: Drop threshold to 70, keep official-only

Approach: Just lower TIER1_THRESHOLD from 80 → 70.

Pros: One-line change.

Cons: Doesn't actually solve the X-thread problem (X is still not official-domain). Just lets borderline official-source finds through. Likely raises false positives without surfacing the highest-signal source.

Estimated Effort: Trivial

Option C: Manual /promote F-XXXX Telegram command

Approach: Bot command that promotes a single find to Tier 1 eligibility regardless of source.

Pros: Explicit, surgical.

Cons: Adds another manual step every Mon/Thu, which defeats the "auto" in auto-apply. Better used as an escape hatch for Option A's misses.

Estimated Effort: Low


Chosen Approach

Decision: Option 1 + Option A. Optionally layer Option C later as escape hatch.

Rationale:

  1. Friction is the actual enemy. The reason feedback.jsonl has 1 row is that there's no zero-friction input surface. Inline keyboard buttons in Telegram are the only mobile UI that survives the "watching TV on couch" test.
  2. Cole-curated source class encodes a real signal. When Cole queues an X thread, he's already done provenance + relevance filtering by hand. The gate should reflect that. Treating his curated picks identically to a random X scrape is what's blocking Tier 1 from firing.
  3. Both changes are reversible. Keyboard buttons are additive: they don't change the text-message flow. The gate change is one conditional, kill-switched via the tier1_apply.py --strict flag.
  4. Loop completeness > feature breadth. Cole's question is "is this robust?" Building #1+#2 closes implement and eval loops simultaneously. Other proposed upgrades (#3-#5 from chat: tier1 log, /approvals, /status) all assume these two work first.

Trade-offs Accepted:

  • Slightly larger blast radius for Tier 1 auto-edits (mitigated by git commit per run + tier1_log.jsonl review trail).
  • Bot now does both text-chat AND callback handling — modest complexity increase in the long-poll loop.
  • Caleb's verdicts are logged but not acted on (separate verdict file), which adds future work if/when his judgment should count.

Implementation Plan

Phase 1: Tier 1 Gate Relax (the unblock)

File: ~/ai-projects-local/mission-control/scripts/ai_intelligence/score.py

  • Add source_class field when canonicalizing: "cole_curated" if URL appears in any browser-agent/threads/pending/ or done/ .md file, else "official" if is_official(source), else "community".
  • Persist source_class in finds.jsonl rows.
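
The classification order above could be sketched as follows. This is an illustration under assumptions, not the code in score.py: classify_source and url_in_thread_files are hypothetical names, is_official is passed in as a callable because its real signature is not shown in this plan, and the curated check is a plain substring match over the .md files.

```python
from pathlib import Path
from typing import Callable, Iterable


def url_in_thread_files(url: str, thread_dirs: Iterable[Path]) -> bool:
    """True if `url` appears in any .md file directly under the given directories."""
    for directory in thread_dirs:
        if not directory.is_dir():
            continue  # e.g. done/ may not exist yet
        for md_file in directory.glob("*.md"):
            if url in md_file.read_text(encoding="utf-8", errors="ignore"):
                return True
    return False


def classify_source(
    url: str,
    source: str,
    thread_dirs: Iterable[Path],
    is_official: Callable[[str], bool],
) -> str:
    """Curated wins over official, and official wins over community."""
    if url_in_thread_files(url, thread_dirs):
        return "cole_curated"
    if is_official(source):
        return "official"
    return "community"
```

A substring match can over-match when one URL is a prefix of another (for example two status IDs that share leading digits), so the real implementation may want to compare canonicalized URLs instead.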

File: ~/ai-projects-local/mission-control/scripts/ai_intelligence/tier1_apply.py

  • Add CLI flag --strict to fall back to old gate (official + 80) for rollback.
  • Update gate: (source_class in ("official", "cole_curated")) AND score >= 75.
  • Add --dry-run output that prints which finds would now pass vs old gate (diff for sanity check).
  • Bump TIER1_LOG schema to include source_class per applied entry.
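
A minimal sketch of the relaxed gate, the --strict fallback, and the dry-run diff, assuming finds are dicts carrying source_class and score. passes_gate, newly_eligible and build_parser are hypothetical names; the thresholds and flag names come from this plan.

```python
import argparse

OLD_THRESHOLD = 80  # current gate: official sources only
NEW_THRESHOLD = 75  # relaxed gate proposed in this plan


def passes_gate(find: dict, strict: bool = False) -> bool:
    """Return True if a find is eligible for Tier 1 auto-apply."""
    source_class = find.get("source_class", "community")
    score = find.get("score", 0)
    if strict:
        # Rollback path: old gate, official + 80.
        return source_class == "official" and score >= OLD_THRESHOLD
    return source_class in ("official", "cole_curated") and score >= NEW_THRESHOLD


def newly_eligible(finds: list[dict]) -> list[dict]:
    """Finds that pass the new gate but would have failed the old one."""
    return [f for f in finds if passes_gate(f) and not passes_gate(f, strict=True)]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Tier 1 auto-apply (sketch)")
    parser.add_argument("--strict", action="store_true",
                        help="fall back to the old gate (official + 80)")
    parser.add_argument("--dry-run", action="store_true",
                        help="print what would be applied without editing files")
    return parser
```

Under the gate as written, official finds scoring 75 to 79 also become eligible, not only curated ones, so the dry-run diff is worth reading for both source classes.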

Verification before merge:

  • Run tier1_apply.py --dry-run against current finds.jsonl — should show ≥1 cole_curated find newly eligible.
  • Run with --strict — should match current "no eligible finds" behavior.

Phase 2: Telegram Reaction Loop

File: ~/ai-projects-local/mission-control/scripts/ai_intelligence/digest.py

  • After digest send, call new function send_digest_cards(top_finds) that sends N (default 5, configurable) inline-keyboard messages — one per find — with buttons: ✅ Done, 🚀 Dispatch, 📌 Later, 🗑 Dismiss.
  • Each button's callback_data encodes verdict:find_id (under 64 bytes — find_ids are F-0617-3-style, fits easily).
  • Card text includes: title, source, score, 1-line summary, link.
  • Wrap in feature flag env var TELEGRAM_DIGEST_CARDS=1 so we can disable without code revert.
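
A sketch of how one card payload might be built for sendMessage, under stated assumptions: build_card and cards_enabled are hypothetical names, the find keys (title, source, score, summary, url) are guesses at what finds.jsonl rows carry, and treating an unset TELEGRAM_DIGEST_CARDS as disabled is a choice made here, not something the plan specifies. The HTTP call itself is left out.

```python
import os

BUTTONS = [
    ("✅ Done", "done"),
    ("🚀 Dispatch", "dispatch"),
    ("📌 Later", "later"),
    ("🗑 Dismiss", "dismiss"),
]
MAX_CALLBACK_BYTES = 64   # Telegram limit on callback_data
MAX_MESSAGE_CHARS = 4096  # Telegram limit on message text


def cards_enabled(env=os.environ) -> bool:
    """Feature flag: cards are sent only when TELEGRAM_DIGEST_CARDS is exactly '1'."""
    return env.get("TELEGRAM_DIGEST_CARDS", "0") == "1"


def build_card(chat_id: int, find: dict) -> dict:
    """Build a sendMessage payload with one row of verdict buttons."""
    row = []
    for label, verdict in BUTTONS:
        data = f"{verdict}:{find['find_id']}"
        if len(data.encode("utf-8")) > MAX_CALLBACK_BYTES:
            raise ValueError(f"callback_data too long: {data!r}")
        row.append({"text": label, "callback_data": data})
    text = (
        f"{find['title']}\n"
        f"{find['source']} · score {find['score']}\n"
        f"{find['summary']}\n"
        f"{find['url']}"
    )
    return {
        "chat_id": chat_id,
        "text": text[:MAX_MESSAGE_CHARS],  # hard truncate as a last resort
        "reply_markup": {"inline_keyboard": [row]},
    }
```

The payload can be sent as a JSON body, which lets reply_markup stay a nested object. If four buttons in one row feel cramped on a phone, inline_keyboard accepts multiple rows.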

File: ~/ai-projects-local/mission-control/scripts/telegram_watcher.py

  • In getUpdates loop, branch on callback_query in addition to message.
  • Add handle_callback_query(update):
      • Parse verdict:find_id from callback_data.
      • Permission check: only admin tier can dispatch; operator verdicts are logged separately to feedback_operator.jsonl.
      • For verdict = dispatch: shell out to python3 ai_intelligence/feedback.py --dispatch F-XXXX-N.
      • For verdict in (done, later, dismiss): append row to feedback.jsonl via shared helper.
      • Answer the callback with answerCallbackQuery (toast: "Logged: ✅ done").
      • Edit the original message to remove the buttons (editMessageReplyMarkup with an empty inline keyboard) so cards don't accept re-clicks.
      • Save callback_query.message.message_id keyed by find_id in state file so we can edit later if needed.
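
The parsing and permission steps above might look like this as a pure function that decides what to do, leaving the Telegram calls and file writes to the caller. parse_callback_data and plan_callback are hypothetical names, tier_of stands in for whatever lookup daemon/users.py provides, and logging an operator's Dispatch tap instead of dispatching it is one reading of the permission rule.

```python
VALID_VERDICTS = {"done", "dispatch", "later", "dismiss"}


def parse_callback_data(data: str) -> tuple[str, str]:
    """Split 'verdict:find_id' and validate the verdict."""
    verdict, sep, find_id = data.partition(":")
    if not sep or not find_id or verdict not in VALID_VERDICTS:
        raise ValueError(f"unrecognised callback_data: {data!r}")
    return verdict, find_id


def plan_callback(callback_query: dict, tier_of) -> dict:
    """Decide the action for one button tap without performing any I/O."""
    verdict, find_id = parse_callback_data(callback_query.get("data", ""))
    tier = tier_of(callback_query["from"]["id"])
    if tier == "admin" and verdict == "dispatch":
        action, log_file = "dispatch", None  # handed to feedback.py --dispatch
    elif tier == "admin":
        action, log_file = "log", "feedback.jsonl"
    elif tier == "operator":
        action, log_file = "log", "feedback_operator.jsonl"  # never dispatches
    else:
        action, log_file = "reject", None  # unknown user
    return {
        "callback_query_id": callback_query["id"],
        "find_id": find_id,
        "verdict": verdict,
        "action": action,
        "log_file": log_file,
    }
```

The caller would then perform the action, answer with answerCallbackQuery using callback_query_id, and clear the keyboard on the original message.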

File: ~/ai-projects-local/mission-control/scripts/ai_intelligence/feedback.py

  • Expose a log_verdict(find_id, verdict, source="telegram", user="cole") helper that telegram_watcher can import (not just call via subprocess) — avoids spawning a python process per tap.
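
A hedged sketch of what the importable helper could look like. The first four parameters follow the signature named above; the keyword-only path and find parameters are assumptions added here so the sketch is self-contained. It keeps the existing keys (ts, find_id, verdict, title, score, lens) and appends source and user, which assumes existing readers of feedback.jsonl tolerate extra keys.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_verdict(find_id, verdict, source="telegram", user="cole", *, path, find=None):
    """Append one verdict row to a JSONL file and return the row."""
    find = find or {}
    row = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "find_id": find_id,
        "verdict": verdict,
        "title": find.get("title"),
        "score": find.get("score"),
        "lens": find.get("lens"),
        "source": source,  # extra key, not in the current schema
        "user": user,      # extra key, not in the current schema
    }
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(row, ensure_ascii=False) + "\n")
    return row
```

Passing path explicitly lets the same helper write operator verdicts to feedback_operator.jsonl without a second code path.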

Phase 3: Observability + Rollout

  • Add a tier1_log.jsonl viewer: for now, a simple cat | jq recipe in PLAN.md / README.md; a phone-friendly /tier1 log command can come later (Phase 4, if/when built).
  • Add one-line stat to next digest: "Last 7d: X auto-applied, Y verdicts logged" — proves the loop is closing.
  • Update agent-council.md after first successful auto-apply with the find_id + commit SHA.
  • Run digest manually with cards enabled, confirm Cole gets 5 buttons that respond on tap.
  • Enable on Mon/Thu schedule after one clean manual run.
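
The "Last 7d" stat line could be computed roughly as below, assuming both tier1_log.jsonl and feedback.jsonl carry an ISO 8601 ts field on every row (true for feedback.jsonl per its schema, an assumption for tier1_log.jsonl, which does not exist yet). count_recent and stats_line are hypothetical names.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def count_recent(path: Path, now: datetime, days: int = 7) -> int:
    """Count JSONL rows whose ts falls within the last `days` days."""
    if not path.exists():
        return 0  # a missing log counts as zero activity
    cutoff = now - timedelta(days=days)
    count = 0
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                ts = datetime.fromisoformat(json.loads(line)["ts"])
            except (ValueError, KeyError, TypeError):
                continue  # skip malformed rows rather than fail the digest
            if ts.tzinfo is None:
                ts = ts.replace(tzinfo=timezone.utc)  # treat naive ts as UTC
            if ts >= cutoff:
                count += 1
    return count


def stats_line(tier1_log: Path, feedback: Path, now=None) -> str:
    """Build the one-line loop-health footer for the digest."""
    now = now or datetime.now(timezone.utc)
    applied = count_recent(tier1_log, now)
    verdicts = count_recent(feedback, now)
    return f"Last 7d: {applied} auto-applied, {verdicts} verdicts logged"
```

Because a missing file counts as zero, the footer can ship before the first auto-apply and will simply read 0 until the loop starts firing.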

Phase 4 (deferred — explicit non-goals here)

  • /tier1 log, /approvals, /approve N, /reject N, /status — covered in chat scope but separate plan after #1+#2 prove out.
  • Voice → DISPATCH task — separate plan.
  • /promote F-XXXX escape hatch — only if Option A misses surface in week 1.

Acceptance Criteria

Must Have:

  • Tier 1 gate accepts cole_curated source class; --dry-run shows ≥1 newly eligible find from current finds.jsonl.
  • One full scrape cycle results in ≥1 entry in tier1_log.jsonl (i.e. it actually fires).
  • One Mon/Thu digest produces N inline-keyboard cards in Telegram.
  • Tapping any of (Done, Later, Dismiss) writes a row to feedback.jsonl with correct find_id, verdict, ts, user.
  • Tapping Dispatch creates an MC daemon task via existing feedback.py --dispatch code path.
  • Operator (Caleb) taps log to feedback_operator.jsonl, not feedback.jsonl.
  • Buttons are removed after a tap (no double-vote).
  • Feature flag TELEGRAM_DIGEST_CARDS=0 reverts to old text-only digest cleanly.

Should Have:

  • First post-build digest shows "Last 7d: X auto-applied, Y verdicts logged" footer.
  • Each auto-applied Tier 1 edit lands as a single commit in ~/ai-projects-local/docs (audit trail).

Nice to Have:

  • Cards include score delta vs previous digest (trend signal).
  • After 3 consecutive "Dismiss" verdicts on same source, source is auto-demoted in scoring.

Verification Steps

  1. Tier 1 dry-run sanity check:
       python3 ~/ai-projects-local/mission-control/scripts/ai_intelligence/tier1_apply.py --dry-run
       python3 ~/ai-projects-local/mission-control/scripts/ai_intelligence/tier1_apply.py --dry-run --strict
     Confirm new gate surfaces cole_curated finds; strict matches today's "no eligible finds".

  2. Reaction smoke test (one find, one user):
       • Manually invoke digest.py with TELEGRAM_DIGEST_CARDS=1 and top-K=1.
       • Receive card on Telegram, tap each button (run 4 times with 4 different finds), verify each writes to feedback.jsonl.
       • Confirm callback answer toast appears within 1s.

  3. End-to-end real run:
       • Wait for next Mon/Thu digest (or trigger manually).
       • Confirm cards arrive, react to all 5, check feedback.jsonl has 5 fresh rows.
       • Confirm one Tier 1 auto-apply happens within 24h (next scrape cycle picks up Cole's curated thread).

  4. Rollback path:
       • export TELEGRAM_DIGEST_CARDS=0 reverts digest to text-only.
       • --strict flag on tier1_apply reverts to old gate.
       • Both should leave feedback.jsonl and tier1_log.jsonl untouched.


Execution Log

2026-06-17 — Planning

Plan drafted from conversation audit. Source-of-truth findings:

  • tier1_apply.py last logged tier1: no eligible finds at 2026-06-16 06:44.
  • feedback.jsonl = 1 row (2026-06-12), tier1_log.jsonl doesn't exist.
  • Telegram bot has 0 callback_query support today.

Next steps: Opus plan review (per CLAUDE.md), then /grill-me, then execute Phase 1 (Tier 1 gate) before Phase 2 (Telegram cards) — gate change is lower-risk and unblocks measurable Tier 1 activity even before cards ship.


Lessons Learned

[Add after completion]

What Worked:

What Didn't:

Next Time:


References

  • ~/ai-projects-local/mission-control/scripts/ai_intelligence/ — new AI intelligence flow (built 2026-06-12)
  • ~/ai-projects-local/mission-control/scripts/telegram_watcher.py — main bot, target for callback_query branch
  • ~/ai-projects-local/mission-control/scripts/ai_intelligence/PLAN.md — original plan of record for the intelligence flow
  • ~/ai-projects/mission-control/data/ai-intelligence/ — runtime data (finds, feedback, state)
  • Conversation 2026-06-17 (this session) — audit that revealed implement + eval loops dormant
  • Telegram Bot API → sendMessage reply_markup.inline_keyboard, callback_query, answerCallbackQuery