
Handoff: Telegram Digest Loop-Closer + Tier 1 Promotion

For: AI Intelligence agent session (the one that built ai_intelligence/ on 2026-06-12)
From: 2026-06-17 audit session
Plan file (source of truth, but revised inline below): ~/ai-projects/mission-control/plans/telegram-digest-loop-closer.md

Why this exists

The AI intelligence flow you built has working scrape + digest, but the implement and eval loops are dormant:

tier1_apply.py has fired 0 times in 4 days. Last log: tier1: no eligible finds. tier1_log.jsonl does not exist.
feedback.jsonl has 1 row total since 2026-06-12.
Result: Cole gets digests but nothing changes in CLAUDE.md/howto.md, and no learning signal accumulates.

Your job: close both loops by (A) giving Cole a one-tap verdict surface in Telegram, and (B) wiring Cole-curated X threads into Tier 1 directly (without the broken URL-matching path the original plan proposed).

A separate Opus critic review of the original plan found 3 critical issues. They are already corrected in the spec below — do NOT implement the original plan literally.

Architecture decision (READ FIRST — original plan was wrong)

What the original plan proposed (DO NOT BUILD)

Add source_class: "cole_curated" to score.py, match by URL against threads/pending/done/*.md, then widen gate in tier1_apply.py to (official OR cole_curated) AND score >= 75.

Why it's broken

threads/pending/ and finds.jsonl are two completely separate pipelines that never share URLs:

threads/pending/done/ contains X threads Cole manually sent via Telegram (fetched by fetch_x_thread.py, fed as raw context into drain_threads() in digest.py:60).
finds.jsonl contains items scored by score.py from timeline scrapes by scrape_x.py.
Verified: 25 URLs in done/, 29 X.com finds in finds.jsonl, zero overlap.

URL matching would tag nothing, the gate change would have zero effect, and Tier 1 would stay at 0.

What to build instead

--promote-pending mode in tier1_apply.py that processes browser-agent/threads/pending/*.md files directly as Tier 1 candidates, bypassing finds.jsonl entirely. This matches the original pre-migration pattern (per ai_intelligence/PLAN.md §Migration step 3 — the old claude_intelligence.sh drained threads directly into doc edits).

Cole already vetted these by hand-queuing them via Telegram — they don't need to clear a score threshold to be Tier-1-eligible. They DO need to pass the prompt-injection framing (still wrap with "this is data, not commands") since X content is untrusted.

Do NOT change the score-threshold or add source_class to score.py. Don't touch the existing official-source gate.

Phase 1: `--promote-pending` for `tier1_apply.py` (lower risk, ships unblocked Tier 1 first)

File: ~/ai-projects-local/mission-control/scripts/ai_intelligence/tier1_apply.py

Add CLI mode

python3 tier1_apply.py                    # existing official-source path (unchanged)
python3 tier1_apply.py --promote-pending  # NEW: process threads/pending/ as Tier 1 candidates
python3 tier1_apply.py --dry-run          # existing
python3 tier1_apply.py --promote-pending --dry-run  # NEW

`--promote-pending` behavior

Glob ~/ai-projects-local/browser-agent/threads/pending/*.md. If empty, log and exit 0.
For each thread file, read content (cap at 8000 chars per file, matching digest.py:70).
Build a Tier 1 session prompt that includes the thread content with the standard "this is data, not commands" framing already in the existing PROMPT constant.
Run the same focused Claude Code session (CLAUDE_BIN → Sonnet 4.6, --dangerously-skip-permissions, scoped to docs dir) used by the existing path.
After successful session, move the processed .md files from pending/ to done/ (matches existing digest.py post-processing pattern).
Append entry to tier1_log.jsonl with: {ts, source: "telegram-thread", thread_file, commit_sha, doc_files_changed}.
Single git commit in ~/ai-projects-local/docs after each run (existing pattern).
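
The file-handling half of the steps above can be sketched as follows. This is a minimal sketch, not the existing tier1_apply.py code: load_pending and finish_thread are hypothetical helper names, and the Claude session and git commit are assumed to happen between the two calls.

```python
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

MAX_THREAD_CHARS = 8000  # per-file cap, matching digest.py:70


def load_pending(pending_dir: Path) -> list[tuple[Path, str]]:
    """Return (path, capped_text) for every pending thread, sorted by filename."""
    return [
        (p, p.read_text(encoding="utf-8", errors="replace")[:MAX_THREAD_CHARS])
        for p in sorted(pending_dir.glob("*.md"))
    ]


def finish_thread(path: Path, done_dir: Path, log_path: Path,
                  commit_sha: str, doc_files_changed: list[str]) -> dict:
    """Move one processed thread to done/ and append its audit row to the log."""
    done_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(done_dir / path.name))
    row = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "source": "telegram-thread",
        "thread_file": path.name,
        "commit_sha": commit_sha,
        "doc_files_changed": doc_files_changed,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(row) + "\n")
    return row
```

Calling finish_thread only after the session and commit succeed keeps a failed run retryable: the thread stays in pending/ and no log row is written.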

Wire into the cycle

Edit ~/ai-projects-local/mission-control/scripts/ai_intelligence/run.sh so the scrape mode calls tier1_apply.py --promote-pending AFTER the existing tier1_apply.py call. Both paths can fire in the same scrape run; they're independent.

Avoid Mon/Thu digest contention

digest.py:60 (drain_threads()) also reads pending/ and moves files to done/ after the digest. Scrape runs 2×/day and digest runs Mon/Thu, so the two could race on the same files. Current schedules make a collision unlikely, but guard against it anyway:

Acquire a file lock on ~/ai-projects-local/browser-agent/threads/.lock before processing.
Alternative: have --promote-pending skip files modified within the last 60 seconds (lets digest grab them first if it's running).
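
Both guards can be sketched in a few lines. This is a sketch under assumptions: threads_lock and settled_files are hypothetical names, and the lock only protects anything if digest.py's drain path takes the same lock file, because flock is advisory.

```python
import fcntl
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def threads_lock(lock_path: Path):
    """Hold an exclusive advisory lock; blocks until any other holder releases it."""
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    with lock_path.open("w") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)


def settled_files(pending_dir: Path, min_age_s: float = 60.0) -> list[Path]:
    """Pending threads whose mtime is at least min_age_s seconds old."""
    cutoff = time.time() - min_age_s
    return [p for p in sorted(pending_dir.glob("*.md")) if p.stat().st_mtime <= cutoff]
```

Usage would look like `with threads_lock(threads_dir / ".lock"): files = settled_files(threads_dir / "pending")`, with all processing and the move to done/ kept inside the `with` block.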

Acceptance for Phase 1

☐ python3 tier1_apply.py --promote-pending --dry-run lists current pending/*.md files and what they'd produce, without committing.
☐ First real run produces ≥1 row in tier1_log.jsonl and ≥1 commit in ~/ai-projects-local/docs.
☐ Existing tier1_apply.py (no flag) behavior unchanged — still passes --strict semantics (matches today's no eligible finds if no official-source ≥80 finds exist).
☐ No race condition: don't process a file the digest is mid-draining.

Phase 2: Telegram digest cards with inline keyboard verdicts

File: `digest.py` — add card sender

Critical correction from Opus review: top_finds doesn't exist as a variable in digest.py:main(). What exists is top_ids (list of ID strings) and new_finds (or all_finds, depending on the code path). You must reconstruct find objects via {f["id"]: f for f in new_finds} (already partially done at lines 339-345).
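
The reconstruction can be sketched as a small helper. resolve_top_finds is a hypothetical name, and the sketch assumes only that each find is a dict with an "id" key, as the comprehension above implies.

```python
def resolve_top_finds(top_ids, finds):
    """Map ranked IDs back to find dicts, keeping rank order and skipping unknown IDs."""
    by_id = {f["id"]: f for f in finds}
    return [by_id[i] for i in top_ids if i in by_id]
```

Skipping unknown IDs rather than raising means a stale ID in top_ids costs one card instead of the whole digest send.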

Add a function after the Telegram summary send (~line 351):

def send_digest_cards(top_finds, cfg):
    """One inline-keyboard message per top-K find. Verdict buttons write to feedback.jsonl."""
    if os.environ.get("TELEGRAM_DIGEST_CARDS", "0") != "1":
        return  # feature flag off (unset or "0"); a bare truthiness check would treat "0" as on

    k = int(os.environ.get("TELEGRAM_DIGEST_CARDS_K", "5"))
    for find in top_finds[:k]:
        find_id = find["id"]
        text = (
            f"*{find['title'][:200]}*\n"
            f"Score: {find['score']} | Source: {find['source']}\n"
            f"{find.get('summary', '')[:300]}\n"
            f"{find.get('canonical_url', '')}"
        )
        # callback_data must be 1-64 bytes. F-IDs are ~10 chars; the verdict prefix ("done:", "later:", ...) adds 5-6. Safe.
        keyboard = {
            "inline_keyboard": [[
                {"text": "✅ Done",     "callback_data": f"done:{find_id}"},
                {"text": "🚀 Dispatch", "callback_data": f"disp:{find_id}"},
                {"text": "📌 Later",    "callback_data": f"later:{find_id}"},
                {"text": "🗑 Dismiss",  "callback_data": f"dism:{find_id}"},
            ]]
        }
        telegram.send_message_with_keyboard(text, keyboard)  # add helper in telegram.py

Source-of-truth caveat: top_ids is the pre-scored top-K. The Claude-generated report may pick different items as "top." That's OK — top_ids is deterministic, the report is authoritative for narrative. Document this in the digest message: "Cards below = pre-scored top 5; see report for narrative selection." Don't try to parse top-K from the report (fragile).
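
Since the 64-byte callback_data limit is only safe while find IDs stay short, a guard at build time may be worth having. The sketch below is an assumption, not part of the plan: make_callback_data is a hypothetical helper that the keyboard builder could call instead of formatting the string inline.

```python
MAX_CALLBACK_BYTES = 64  # Bot API limit for InlineKeyboardButton.callback_data


def make_callback_data(verdict_short: str, find_id: str) -> str:
    """Build 'verdict:find_id' and fail loudly if it exceeds the byte limit."""
    data = f"{verdict_short}:{find_id}"
    size = len(data.encode("utf-8"))  # the limit is in bytes, not characters
    if not 1 <= size <= MAX_CALLBACK_BYTES:
        raise ValueError(f"callback_data is {size} bytes, limit is {MAX_CALLBACK_BYTES}")
    return data
```

Failing at send time surfaces an oversized ID in the digest log, rather than as a rejected sendMessage call.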

File: `telegram.py` — add keyboard helper

def send_message_with_keyboard(text, keyboard, chat_id=COLE_TELEGRAM_ID):
    """sendMessage with reply_markup. Returns message_id for later editMessageText."""
    params = {
        "chat_id": chat_id,
        "text": text[:4000],
        "parse_mode": "Markdown",
        "reply_markup": json.dumps(keyboard),
    }
    result = api_call("sendMessage", params)
    if result and result.get("ok"):
        return result["result"]["message_id"]
    return None


def answer_callback(callback_query_id, text=None):
    """Answer promptly: the client shows a progress indicator on the button until this is called. Target: under 1s."""
    params = {"callback_query_id": callback_query_id}
    if text:
        params["text"] = text[:200]
    return api_call("answerCallbackQuery", params)


def edit_message_reply_markup(chat_id, message_id, reply_markup=None):
    """Strip buttons after vote so the card can't be re-voted."""
    params = {"chat_id": chat_id, "message_id": message_id}
    params["reply_markup"] = json.dumps(reply_markup or {"inline_keyboard": []})
    return api_call("editMessageReplyMarkup", params)
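
One caveat with parse_mode "Markdown": titles and summaries scraped from X are parsed as markup, so a stray `*` or `_` can make sendMessage fail with an entity-parsing error. One possible mitigation, offered as an assumption rather than a decision of this plan, is to render cards as HTML and escape every field. render_card_html is a hypothetical name; the field names are the ones send_digest_cards already uses. The helper would need parse_mode set to "HTML" to match.

```python
import html


def render_card_html(find: dict) -> str:
    """Render one card for parse_mode='HTML', escaping all scraped text."""
    # Truncate before escaping so an entity such as &lt; is never cut in half.
    title = html.escape(str(find["title"])[:200], quote=False)
    summary = html.escape(str(find.get("summary", ""))[:300], quote=False)
    source = html.escape(str(find["source"]), quote=False)
    url = html.escape(str(find.get("canonical_url", "")), quote=False)
    return (
        f"<b>{title}</b>\n"
        f"Score: {find['score']} | Source: {source}\n"
        f"{summary}\n"
        f"{url}"
    )
```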

File: `telegram_watcher.py` — handle callback_query

Critical correction from Opus review: The current getUpdates loop at telegram_watcher.py:726-748 advances last_update_id (line 729) BEFORE checking if the update has a message (line 736). For callback_query updates, update.get("message") returns None, the loop continues, the offset advances, and the callback is silently lost forever.

The callback branch MUST go BEFORE the msg = update.get("message", {}) line. Pattern:

for update in updates:
    update_id = update["update_id"]
    if update_id <= last_update_id:
        continue
    last_update_id = update_id
    save_state(state)  # existing — keep order

    # NEW: callback_query branch FIRST (before message check)
    if "callback_query" in update:
        handle_callback_query(update["callback_query"])
        continue

    msg = update.get("message", {})
    if not msg:
        continue
    # ...existing message handling unchanged

`handle_callback_query` — SLA-safe order

def handle_callback_query(cb):
    """Answer the callback FIRST (1s toast target; the button shows a spinner until answered), then process the verdict."""
    cb_id = cb["id"]
    user_id = cb["from"]["id"]
    data = cb.get("data", "")
    msg = cb.get("message", {})
    chat_id = msg.get("chat", {}).get("id")
    message_id = msg.get("message_id")

    # 1. Permission check (cheap)
    user = _lookup_user(user_id)
    if not user:
        telegram.answer_callback(cb_id, text="Not authorized")
        return

    # 2. Parse verdict
    if ":" not in data:
        telegram.answer_callback(cb_id, text="Bad payload")
        return
    verdict_short, find_id = data.split(":", 1)
    verdict_map = {"done": "done", "disp": "dispatch", "later": "later", "dism": "dismiss"}
    verdict = verdict_map.get(verdict_short)
    if not verdict:
        telegram.answer_callback(cb_id, text="Unknown verdict")
        return

    # 3. Permission gate for dispatch (admin only); cheap, so resolve it before answering
    denied_dispatch = verdict == "dispatch" and user.get("permission_tier") != "admin"
    if denied_dispatch:
        verdict = "later"

    # 4. ANSWER FIRST (SLA) with the final verdict, THEN do the slow work
    telegram.answer_callback(cb_id, text=f"Logged: {verdict}")
    if denied_dispatch:
        telegram.send_message(chat_id, "Only admin can dispatch. Logged as 'later'.")

    # 5. Write feedback (cheap — appends one JSON line)
    _append_feedback(find_id, verdict, user.get("name", "unknown"))

    # 6. Dispatch if applicable (heaviest — DB write via feedback.py)
    if verdict == "dispatch":
        try:
            subprocess.run(
                ["python3",
                 os.path.expanduser("~/ai-projects-local/mission-control/scripts/ai_intelligence/feedback.py"),
                 "--dispatch", find_id],
                timeout=10,
                check=True,
            )
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e:
            logging.error(f"Dispatch failed for {find_id}: {e}")

    # 7. Strip buttons so it can't be re-voted
    if chat_id and message_id:
        telegram.edit_message_reply_markup(chat_id, message_id)

`_append_feedback` helper

feedback.py exposes record(verdict, find_id) but its signature doesn't take a user. Two options:

Option A (clean): Add user parameter to feedback.py:record(), write it into the jsonl row. Backward compat: default user="cole" if omitted.
Option B (skip the change): Telegram watcher writes its own append directly to feedback.jsonl with matching schema. Simpler but duplicates schema knowledge.

Recommend Option A. Schema fields to add: user (string), via ("telegram" / "cli"). Existing schema: ts, find_id, verdict, title, score, lens. Backfill: leave existing rows unmodified; new code paths populate the new fields.
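
A row under the extended schema might be built like this. It is a sketch only: build_feedback_row and append_row are hypothetical names, not the existing feedback.py API, and the ISO-8601 UTC timestamp format is an assumption about what existing rows use.

```python
import json
from datetime import datetime, timezone


def build_feedback_row(find_id, verdict, title=None, score=None, lens=None,
                       user="cole", via="cli"):
    """Existing fields first, then the two new ones with backward-compatible defaults."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "find_id": find_id,
        "verdict": verdict,
        "title": title,
        "score": score,
        "lens": lens,
        "user": user,
        "via": via,
    }


def append_row(path, row):
    """Append one JSON object per line; older rows are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(row) + "\n")
```

The Telegram path would pass via="telegram" and the looked-up user name, while CLI callers that omit both keep working unchanged.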

Drop the operator-verdicts file from the original plan

Opus flagged it as dead complexity (Caleb's verdicts would write to a file nothing reads). Skip it. If admin permission rejects dispatch from non-admin, log as later with note (shown above). Simpler.

Acceptance for Phase 2

☐ TELEGRAM_DIGEST_CARDS=1 env var enables cards; unset = old text-only behavior.
☐ Manual python3 digest.py with the flag sends N inline-keyboard messages.
☐ Each button tap shows a toast within 1s ("Logged: done").
☐ Tapping any button writes one row to feedback.jsonl with the correct find_id, verdict, via:"telegram", user:"cole".
☐ Tapping Dispatch (as admin) produces an MC daemon task via existing feedback.py --dispatch path.
☐ After tap, buttons strip from the card (no double-vote).
☐ callback_query updates no longer get swallowed (verify: log line on every callback receipt).
☐ Race condition: if Cole sends a text message between tapping two buttons, neither save_state call corrupts state.

State file locking (Opus call-out)

load_state/save_state in telegram_watcher.py have no file lock around individual writes. Two paths now write to the same JSON: message handler (existing) and callback handler (new). The plan from Opus: either wrap state writes in fcntl.flock, or store callback-related state in a separate file (callback_state.json).

Recommended: add fcntl.flock around the existing save_state() body. One-line change. Test by spamming text + button taps and verifying last_update_id doesn't go backward.
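
A locked write could look roughly like the sketch below. It assumes the state is a JSON-serializable dict stored at a single path; save_state_locked, the sidecar ".lock" file, and the temp-file-plus-rename step are illustrative choices, not the current save_state() implementation.

```python
import fcntl
import json
import os


def save_state_locked(state: dict, state_path: str) -> None:
    """Serialize writers on a sidecar lock file, then replace the JSON atomically."""
    with open(state_path + ".lock", "w") as lock_fh:
        fcntl.flock(lock_fh, fcntl.LOCK_EX)  # blocks until other writers finish
        try:
            tmp_path = state_path + ".tmp"
            with open(tmp_path, "w", encoding="utf-8") as fh:
                json.dump(state, fh)
                fh.flush()
                os.fsync(fh.fileno())
            # A reader sees either the old file or the new one, never a partial write.
            os.replace(tmp_path, state_path)
        finally:
            fcntl.flock(lock_fh, fcntl.LOCK_UN)
```

The lock orders the writes but does not make a read-modify-write cycle atomic. If load_state and save_state can interleave across handlers, the load would need to happen under the same lock.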

Phase 3: Observability

☐ Add to next digest header: "Last 7d: {tier1_count} auto-applied, {feedback_count} verdicts logged". Reads tier1_log.jsonl and feedback.jsonl. Proves the loop is closing.
☐ Append to agent-council.md after first successful --promote-pending run with: {ts} | Agent: ai-intelligence | Session: promote-pending → OUTPUT: applied {thread_file} → commit {sha}, docs changed: [list].
☐ Add ONE-line stat to ~/cron-logs/ai-intelligence-scrape.log per scrape: tier1_promote: applied=N, threads_drained=M.
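
The 7-day counts for the digest header can be computed with a small reader. This is a sketch with assumptions: count_recent is a hypothetical helper, and it assumes each row carries a "ts" value that datetime.fromisoformat can parse. Rows that fail to parse are skipped rather than counted.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def count_recent(path: Path, days: int = 7, now=None) -> int:
    """Count JSONL rows whose ts is within the last `days`; a missing file counts as 0."""
    if not path.exists():
        return 0
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    count = 0
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        try:
            ts = datetime.fromisoformat(json.loads(line)["ts"])
        except (ValueError, KeyError, TypeError):
            continue  # malformed row or unparseable timestamp
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
        if ts >= cutoff:
            count += 1
    return count
```

The header then becomes `f"Last 7d: {count_recent(tier1_log)} auto-applied, {count_recent(feedback_log)} verdicts logged"`, which also reads correctly before tier1_log.jsonl exists.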

Test plan (run in this order)

1. Phase 1 verification (Tier 1 promote)

# Drop a test X thread URL into Telegram first (Cole does this from phone)
ls ~/ai-projects-local/browser-agent/threads/pending/
# Confirm at least 1 .md file present.

# Dry run
cd ~/ai-projects-local/mission-control/scripts/ai_intelligence
python3 tier1_apply.py --promote-pending --dry-run
# Expect: lists pending files + what doc edits would happen. No git commit.

# Real run
python3 tier1_apply.py --promote-pending
# Expect: 1+ commit in ~/ai-projects-local/docs, 1+ row in tier1_log.jsonl, processed .md files moved to done/.

# Verify existing path still works
python3 tier1_apply.py --dry-run
# Expect: matches today's "no eligible finds" (or finds what's actually eligible).

2. Phase 2 verification (Telegram cards)

# Trigger digest manually with cards on
TELEGRAM_DIGEST_CARDS=1 TELEGRAM_DIGEST_CARDS_K=2 python3 digest.py
# Expect: 2 inline-keyboard cards arrive on Telegram.

# Cole taps Done on card 1, Dismiss on card 2.
tail -2 ~/ai-projects/mission-control/data/ai-intelligence/feedback.jsonl
# Expect: 2 rows, verdicts "done" and "dismiss", via:"telegram", user:"cole".

# Confirm buttons strip from cards after tap.

# Test admin gate
# Have Caleb (operator) tap Dispatch on a card.
# Expect: Caleb's tap logs as "later" with note "only admin can dispatch."

3. End-to-end Mon/Thu digest

Wait for the next scheduled digest (or invoke it manually).
Confirm cards arrive, react to all 5.
Wait 12 hours, run next scrape. Confirm at least 1 row in tier1_log.jsonl (from any threads Cole queued in interim).
Check digest header on subsequent digest shows "Last 7d: 1 auto-applied, 5 verdicts logged".

What NOT to build (explicit non-goals)

/tier1 log, /approvals, /approve N, /reject N, /status commands — separate plan after #1+#2 prove out.
Voice → DISPATCH task — separate plan.
source_class field on finds.jsonl — original plan proposed this; Opus showed it doesn't work. Skip entirely.
Threshold change 80 → 75 — Opus showed official finds score 53-69, so a threshold drop has no effect. This is a rubric problem, not a threshold problem. If you want more official-source Tier 1 activity, the right fix is rubric tuning in score.py, not gate-relaxing in tier1_apply.py. Out of scope for this plan.
feedback_operator.jsonl — dead file nothing reads. Operator dispatch attempts log as later with a chat warning instead.

Key file references

~/ai-projects-local/mission-control/scripts/ai_intelligence/tier1_apply.py (Phase 1 target)
~/ai-projects-local/mission-control/scripts/ai_intelligence/run.sh (orchestration wiring, Phase 1)
~/ai-projects-local/mission-control/scripts/ai_intelligence/digest.py (~line 339-351 is where to add send_digest_cards)
~/ai-projects-local/mission-control/scripts/ai_intelligence/telegram.py (add 3 helpers)
~/ai-projects-local/mission-control/scripts/telegram_watcher.py (~line 726-748 callback branch)
~/ai-projects-local/mission-control/scripts/ai_intelligence/feedback.py (add user param to record())
~/ai-projects-local/browser-agent/threads/pending/ (source for --promote-pending)
~/ai-projects-local/browser-agent/threads/done/ (destination after processing)
~/ai-projects/mission-control/data/ai-intelligence/feedback.jsonl (verdict log)
~/ai-projects/mission-control/data/ai-intelligence/tier1_log.jsonl (auto-apply audit trail — will be created on first run)

Cole's preferences relevant here

No backwards-compat shims — if old behavior is replaced, replace it cleanly. Don't keep both paths "just in case."
Default to no comments in code; only when the why is non-obvious.
Hard kill switches required on both changes. For Phase 1 = no flag means existing behavior preserved. For Phase 2 = TELEGRAM_DIGEST_CARDS=0 reverts to text-only digest.
Surgical changes — don't refactor adjacent code. Match existing style of the file you're editing.
Verification before claiming done — run the test plan above and show output. Don't claim "verified" unless you actually ran it.
WarehouseAPI is NOT involved in this work — don't touch it.

What the prior session (this one) confirmed

Current tier1_log.jsonl: does not exist (0 applies).
Current feedback.jsonl: 1 row from 2026-06-12.
Current pending/: 1 thread (2026-06-16-2066882971374678057-*.md) waiting for digest drain.
Current done/: 10 threads processed since late May, regular cadence.
Telegram watcher has zero callback_query / inline-keyboard support today (greenfield for Phase 2).
Scrape ran successfully this morning 06:44 — pipeline upstream of these changes is healthy.

If you discover anything else broken upstream while doing this work, log it to agent-council.md rather than fixing it inline. Stay scoped.

Build Phase 1 first. Verify it produces tier1_log.jsonl rows. THEN build Phase 2. Phase 1 is lower-risk (file-system + existing Claude session pattern), and once it ships, Cole gets visible Tier 1 activity even before Phase 2 lands.

Good luck. Update execution log in ~/ai-projects/mission-control/plans/telegram-digest-loop-closer.md as you work.

Source: ~/ai-projects/mission-control/plans/HANDOFF-telegram-loop-closer.md