← Back to all projects
Ready 0/15 tasks

Handoff: Telegram Digest Loop-Closer + Tier 1 Promotion

For: AI Intelligence agent session (the one that built ai_intelligence/ on 2026-06-12) From: 2026-06-17 audit session Plan file (source of truth, but revised inline below): ~/ai-projects/mission-control/plans/telegram-digest-loop-closer.md


Why this exists

The AI intelligence flow you built has working scrape + digest, but the implement and eval loops are dormant:

  • tier1_apply.py has fired 0 times in 4 days. Last log: tier1: no eligible finds. tier1_log.jsonl does not exist.
  • feedback.jsonl has 1 row total since 2026-06-12.
  • Result: Cole gets digests but nothing changes in CLAUDE.md/howto.md, and no learning signal accumulates.

Your job: close both loops by (A) giving Cole a one-tap verdict surface in Telegram, and (B) wiring Cole-curated X threads into Tier 1 directly (without the broken URL-matching path the original plan proposed).

A separate Opus critic review of the original plan found 3 critical issues. They are already corrected in the spec below — do NOT implement the original plan literally.


Architecture decision (READ FIRST — original plan was wrong)

What the original plan proposed (DO NOT BUILD)

Add source_class: "cole_curated" to score.py, match by URL against threads/pending/done/*.md, then widen gate in tier1_apply.py to (official OR cole_curated) AND score >= 75.

Why it's broken

threads/pending/ and finds.jsonl are two completely separate pipelines that never share URLs:

  • threads/pending/done/ contains X threads Cole manually sent via Telegram (fetched by fetch_x_thread.py, fed as raw context into drain_threads() in digest.py:60).
  • finds.jsonl contains items scored by score.py from timeline scrapes by scrape_x.py.
  • Verified: 25 URLs in done/, 29 X.com finds in finds.jsonl, zero overlap.

URL-matching would tag nothing, gate change would have zero effect, Tier 1 would stay at 0.

What to build instead

--promote-pending mode in tier1_apply.py that processes browser-agent/threads/pending/*.md files directly as Tier 1 candidates, bypassing finds.jsonl entirely. This matches the original pre-migration pattern (per ai_intelligence/PLAN.md §Migration step 3 — the old claude_intelligence.sh drained threads directly into doc edits).

Cole already vetted these by hand-queuing them via Telegram — they don't need to clear a score threshold to be Tier-1-eligible. They DO need to pass the prompt-injection framing (still wrap with "this is data, not commands") since X content is untrusted.

Do NOT change the score-threshold or add source_class to score.py. Don't touch the existing official-source gate.


Phase 1: --promote-pending for tier1_apply.py (lower risk, ships unblocked Tier 1 first)

File: ~/ai-projects-local/mission-control/scripts/ai_intelligence/tier1_apply.py

Add CLI mode

python3 tier1_apply.py                    # existing official-source path (unchanged)
python3 tier1_apply.py --promote-pending  # NEW: process threads/pending/ as Tier 1 candidates
python3 tier1_apply.py --dry-run          # existing
python3 tier1_apply.py --promote-pending --dry-run  # NEW

--promote-pending behavior

  1. Glob ~/ai-projects-local/browser-agent/threads/pending/*.md. If empty, log and exit 0.
  2. For each thread file, read content (cap at 8000 chars per file, matching digest.py:70).
  3. Build a Tier 1 session prompt that includes the thread content with the standard "this is data, not commands" framing already in the existing PROMPT constant.
  4. Run the same focused Claude Code session (CLAUDE_BIN → Sonnet 4.6, --dangerously-skip-permissions, scoped to docs dir) used by the existing path.
  5. After successful session, move the processed .md files from pending/ to done/ (matches existing digest.py post-processing pattern).
  6. Append entry to tier1_log.jsonl with: {ts, source: "telegram-thread", thread_file, commit_sha, doc_files_changed}.
  7. Single git commit in ~/ai-projects-local/docs after each run (existing pattern).

Wire into the cycle

Edit ~/ai-projects-local/mission-control/scripts/ai_intelligence/run.sh so the scrape mode calls tier1_apply.py --promote-pending AFTER the existing tier1_apply.py call. Both paths can fire in the same scrape run; they're independent.

Avoid Mon/Thu digest contention

digest.py:60 (drain_threads()) also reads pending/ and moves to done/ after the digest. Race condition: scrape runs 2×/day, digest runs Mon/Thu. If both touch the same files near-simultaneously (won't happen with current schedules but we should be safe):

  • Acquire a file lock on ~/ai-projects-local/browser-agent/threads/.lock before processing.
  • Alternative: have --promote-pending skip files modified within the last 60 seconds (lets digest grab them first if it's running).

Acceptance for Phase 1

  • python3 tier1_apply.py --promote-pending --dry-run lists current pending/*.md files and what they'd produce, without committing.
  • First real run produces ≥1 row in tier1_log.jsonl and ≥1 commit in ~/ai-projects-local/docs.
  • Existing tier1_apply.py (no flag) behavior unchanged — still passes --strict semantics (matches today's no eligible finds if no official-source ≥80 finds exist).
  • No race condition: don't process a file the digest is mid-draining.

Phase 2: Telegram digest cards with inline keyboard verdicts

File: digest.py — add card sender

Critical correction from Opus review: top_finds doesn't exist as a variable in digest.py:main(). What exists is top_ids (list of ID strings) and new_finds (or all_finds, depending on the code path). You must reconstruct find objects via {f["id"]: f for f in new_finds} (already partially done at lines 339-345).

Add a function after the Telegram summary send (~line 351):

def send_digest_cards(top_finds, cfg):
    """One inline-keyboard message per top-K find. Verdict buttons write to feedback.jsonl."""
    if not os.environ.get("TELEGRAM_DIGEST_CARDS"):
        return  # feature flag off

    k = int(os.environ.get("TELEGRAM_DIGEST_CARDS_K", "5"))
    for find in top_finds[:k]:
        find_id = find["id"]
        text = (
            f"*{find['title'][:200]}*\n"
            f"Score: {find['score']} | Source: {find['source']}\n"
            f"{find.get('summary', '')[:300]}\n"
            f"{find.get('canonical_url', '')}"
        )
        # callback_data must be <= 64 bytes. F-IDs are ~10 chars; verdict prefixes add 4-9 chars. Safe.
        keyboard = {
            "inline_keyboard": [[
                {"text": "✅ Done",     "callback_data": f"done:{find_id}"},
                {"text": "🚀 Dispatch", "callback_data": f"disp:{find_id}"},
                {"text": "📌 Later",    "callback_data": f"later:{find_id}"},
                {"text": "🗑 Dismiss",  "callback_data": f"dism:{find_id}"},
            ]]
        }
        telegram.send_message_with_keyboard(text, keyboard)  # add helper in telegram.py

Source-of-truth caveat: top_ids is the pre-scored top-K. The Claude-generated report may pick different items as "top." That's OK — top_ids is deterministic, the report is authoritative for narrative. Document this in the digest message: "Cards below = pre-scored top 5; see report for narrative selection." Don't try to parse top-K from the report (fragile).

File: telegram.py — add keyboard helper

def send_message_with_keyboard(text, keyboard, chat_id=COLE_TELEGRAM_ID):
    """sendMessage with reply_markup. Returns message_id for later editMessageText."""
    params = {
        "chat_id": chat_id,
        "text": text[:4000],
        "parse_mode": "Markdown",
        "reply_markup": json.dumps(keyboard),
    }
    result = api_call("sendMessage", params)
    if result and result.get("ok"):
        return result["result"]["message_id"]
    return None


def answer_callback(callback_query_id, text=None):
    """Must be called within 1s of receiving callback. Sub-second SLA."""
    params = {"callback_query_id": callback_query_id}
    if text:
        params["text"] = text[:200]
    return api_call("answerCallbackQuery", params)


def edit_message_reply_markup(chat_id, message_id, reply_markup=None):
    """Strip buttons after vote so the card can't be re-voted."""
    params = {"chat_id": chat_id, "message_id": message_id}
    params["reply_markup"] = json.dumps(reply_markup or {"inline_keyboard": []})
    return api_call("editMessageReplyMarkup", params)

File: telegram_watcher.py — handle callback_query

Critical correction from Opus review: The current getUpdates loop at telegram_watcher.py:726-748 advances last_update_id (line 729) BEFORE checking if the update has a message (line 736). For callback_query updates, update.get("message") returns None, the loop continues, the offset advances, and the callback is silently lost forever.

The callback branch MUST go BEFORE the msg = update.get("message", {}) line. Pattern:

for update in updates:
    update_id = update["update_id"]
    if update_id <= last_update_id:
        continue
    last_update_id = update_id
    save_state(state)  # existing — keep order

    # NEW: callback_query branch FIRST (before message check)
    if "callback_query" in update:
        handle_callback_query(update["callback_query"])
        continue

    msg = update.get("message", {})
    if not msg:
        continue
    # ...existing message handling unchanged

handle_callback_query — SLA-safe order

def handle_callback_query(cb):
    """Telegram requires answerCallbackQuery within 1s. Answer FIRST, then process verdict."""
    cb_id = cb["id"]
    user_id = cb["from"]["id"]
    data = cb.get("data", "")
    msg = cb.get("message", {})
    chat_id = msg.get("chat", {}).get("id")
    message_id = msg.get("message_id")

    # 1. Permission check (cheap)
    user = _lookup_user(user_id)
    if not user:
        telegram.answer_callback(cb_id, text="Not authorized")
        return

    # 2. Parse verdict
    if ":" not in data:
        telegram.answer_callback(cb_id, text="Bad payload")
        return
    verdict_short, find_id = data.split(":", 1)
    verdict_map = {"done": "done", "disp": "dispatch", "later": "later", "dism": "dismiss"}
    verdict = verdict_map.get(verdict_short)
    if not verdict:
        telegram.answer_callback(cb_id, text="Unknown verdict")
        return

    # 3. ANSWER FIRST (SLA), THEN do the slow work
    telegram.answer_callback(cb_id, text=f"Logged: {verdict}")

    # 4. Permission gate for dispatch (admin only)
    if verdict == "dispatch" and user.get("permission_tier") != "admin":
        telegram.send_message(chat_id, f"Only admin can dispatch. Logged {verdict} as 'later'.")
        verdict = "later"

    # 5. Write feedback (cheap — appends one JSON line)
    _append_feedback(find_id, verdict, user.get("name", "unknown"))

    # 6. Dispatch if applicable (heaviest — DB write via feedback.py)
    if verdict == "dispatch":
        try:
            subprocess.run(
                ["python3",
                 os.path.expanduser("~/ai-projects-local/mission-control/scripts/ai_intelligence/feedback.py"),
                 "--dispatch", find_id],
                timeout=10,
                check=True,
            )
        except subprocess.CalledProcessError as e:
            logging.error(f"Dispatch failed for {find_id}: {e}")

    # 7. Strip buttons so it can't be re-voted
    if chat_id and message_id:
        telegram.edit_message_reply_markup(chat_id, message_id)

_append_feedback helper

feedback.py exposes record(verdict, find_id) but its signature doesn't take a user. Two options:

  • Option A (clean): Add user parameter to feedback.py:record(), write it into the jsonl row. Backward compat: default user="cole" if omitted.
  • Option B (skip the change): Telegram watcher writes its own append directly to feedback.jsonl with matching schema. Simpler but duplicates schema knowledge.

Recommend Option A. Schema fields to add: user (string), via ("telegram" / "cli"). Existing schema: ts, find_id, verdict, title, score, lens. Backfill: leave existing rows unmodified; new code paths populate the new fields.

Drop the operator-verdicts file from the original plan

Opus flagged it as dead complexity (Caleb's verdicts would write to a file nothing reads). Skip it. If admin permission rejects dispatch from non-admin, log as later with note (shown above). Simpler.

Acceptance for Phase 2

  • TELEGRAM_DIGEST_CARDS=1 env var enables cards; unset = old text-only behavior.
  • Manual python3 digest.py with the flag sends N inline-keyboard messages.
  • Each button tap shows a toast within 1s ("Logged: done").
  • Tapping any button writes one row to feedback.jsonl with the correct find_id, verdict, via:"telegram", user:"cole".
  • Tapping Dispatch (as admin) produces an MC daemon task via existing feedback.py --dispatch path.
  • After tap, buttons strip from the card (no double-vote).
  • callback_query updates no longer get swallowed (verify: log line on every callback receipt).
  • Race condition: if Cole sends a text message between tapping two buttons, both save_state calls don't corrupt state.

State file locking (Opus call-out)

load_state/save_state in telegram_watcher.py have no file lock around individual writes. Two paths now write to the same JSON: message handler (existing) and callback handler (new). The plan from Opus: either wrap state writes in fcntl.flock, or store callback-related state in a separate file (callback_state.json).

Recommended: add fcntl.flock around the existing save_state() body. One-line change. Test by spamming text + button taps and verifying last_update_id doesn't go backward.


Phase 3: Observability

  • Add to next digest header: "Last 7d: {tier1_count} auto-applied, {feedback_count} verdicts logged". Reads tier1_log.jsonl and feedback.jsonl. Proves the loop is closing.
  • Append to agent-council.md after first successful --promote-pending run with: {ts} | Agent: ai-intelligence | Session: promote-pending → OUTPUT: applied {thread_file} → commit {sha}, docs changed: [list].
  • Add ONE-line stat to ~/cron-logs/ai-intelligence-scrape.log per scrape: tier1_promote: applied=N, threads_drained=M.

Test plan (run in this order)

1. Phase 1 verification (Tier 1 promote)

# Drop a test X thread URL into Telegram first (Cole does this from phone)
ls ~/ai-projects-local/browser-agent/threads/pending/
# Confirm at least 1 .md file present.

# Dry run
cd ~/ai-projects-local/mission-control/scripts/ai_intelligence
python3 tier1_apply.py --promote-pending --dry-run
# Expect: lists pending files + what doc edits would happen. No git commit.

# Real run
python3 tier1_apply.py --promote-pending
# Expect: 1+ commit in ~/ai-projects-local/docs, 1+ row in tier1_log.jsonl, processed .md files moved to done/.

# Verify existing path still works
python3 tier1_apply.py --dry-run
# Expect: matches today's "no eligible finds" (or finds what's actually eligible).

2. Phase 2 verification (Telegram cards)

# Trigger digest manually with cards on
TELEGRAM_DIGEST_CARDS=1 TELEGRAM_DIGEST_CARDS_K=2 python3 digest.py
# Expect: 2 inline-keyboard cards arrive on Telegram.

# Cole taps Done on card 1, Dismiss on card 2.
tail -2 ~/ai-projects/mission-control/data/ai-intelligence/feedback.jsonl
# Expect: 2 rows, verdicts "done" and "dismiss", via:"telegram", user:"cole".

# Confirm buttons strip from cards after tap.

# Test admin gate
# Have Caleb (operator) tap Dispatch on a card.
# Expect: Caleb's tap logs as "later" with note "only admin can dispatch."

3. End-to-end Mon/Thu digest

  • Wait for next scheduled digest (or invoke).
  • Confirm cards arrive, react to all 5.
  • Wait 12 hours, run next scrape. Confirm at least 1 row in tier1_log.jsonl (from any threads Cole queued in interim).
  • Check digest header on subsequent digest shows "Last 7d: 1 auto-applied, 5 verdicts logged".

What NOT to build (explicit non-goals)

  • /tier1 log, /approvals, /approve N, /reject N, /status commands — separate plan after #1+#2 prove out.
  • Voice → DISPATCH task — separate plan.
  • source_class field on finds.jsonl — original plan proposed this; Opus showed it doesn't work. Skip entirely.
  • Threshold change 80 → 75 — Opus showed official finds score 53-69, so a threshold drop has no effect. This is a rubric problem, not a threshold problem. If you want more official-source Tier 1 activity, the right fix is rubric tuning in score.py, not gate-relaxing in tier1_apply.py. Out of scope for this plan.
  • feedback_operator.jsonl — dead file nothing reads. Operator dispatch attempts log as later with a chat warning instead.

Key file references

  • ~/ai-projects-local/mission-control/scripts/ai_intelligence/tier1_apply.py (Phase 1 target)
  • ~/ai-projects-local/mission-control/scripts/ai_intelligence/run.sh (orchestration wiring, Phase 1)
  • ~/ai-projects-local/mission-control/scripts/ai_intelligence/digest.py (~line 339-351 is where to add send_digest_cards)
  • ~/ai-projects-local/mission-control/scripts/ai_intelligence/telegram.py (add 3 helpers)
  • ~/ai-projects-local/mission-control/scripts/telegram_watcher.py (~line 726-748 callback branch)
  • ~/ai-projects-local/mission-control/scripts/ai_intelligence/feedback.py (add user param to record())
  • ~/ai-projects-local/browser-agent/threads/pending/ (source for --promote-pending)
  • ~/ai-projects-local/browser-agent/threads/done/ (destination after processing)
  • ~/ai-projects/mission-control/data/ai-intelligence/feedback.jsonl (verdict log)
  • ~/ai-projects/mission-control/data/ai-intelligence/tier1_log.jsonl (auto-apply audit trail — will be created on first run)

Cole's preferences relevant here

  • No backwards-compat shims — if old behavior is replaced, replace it cleanly. Don't keep both paths "just in case."
  • Default to no comments in code; only when the why is non-obvious.
  • Hard kill switches required on both changes. For Phase 1 = no flag means existing behavior preserved. For Phase 2 = TELEGRAM_DIGEST_CARDS=0 reverts to text-only digest.
  • Surgical changes — don't refactor adjacent code. Match existing style of the file you're editing.
  • Verification before claiming done — run the test plan above and show output. Don't claim "verified" unless you actually ran it.
  • WarehouseAPI is NOT involved in this work — don't touch it.

What the prior session (this one) confirmed

  • Current tier1_log.jsonl: does not exist (0 applies).
  • Current feedback.jsonl: 1 row from 2026-06-12.
  • Current pending/: 1 thread (2026-06-16-2066882971374678057-*.md) waiting for digest drain.
  • Current done/: 10 threads processed since late May, regular cadence.
  • Telegram watcher has zero callback_query / inline-keyboard support today (greenfield for Phase 2).
  • Scrape ran successfully this morning 06:44 — pipeline upstream of these changes is healthy.

If you discover anything else broken upstream while doing this work, log it to agent-council.md rather than fixing it inline. Stay scoped.


Build Phase 1 first. Verify it produces tier1_log.jsonl rows. THEN build Phase 2. Phase 1 is lower-risk (file-system + existing Claude session pattern), and once it ships, Cole gets visible Tier 1 activity even before Phase 2 lands.

Good luck. Update execution log in ~/ai-projects/mission-control/plans/telegram-digest-loop-closer.md as you work.