Morning Macro Brief
A 6:30 a.m. brief in house style covering overnight markets, central-bank moves, and the day's top movers across our holdings.
A single, governed AI layer that plugs into Bloomberg, FactSet, Refinitiv, Aladdin, the custodian and the OMS — and turns the recurring drafting, reconciling and triage work into a five-minute review instead of a five-hour build. Lean default of eight agents, expandable to fifteen only after Phase 1 economics clear. Every number is cited. Every action waits for a human. Every step is logged for seven years.
Every working day, the same recurring outputs are hand-assembled from four or five vendor screens and a handful of internal documents. The data is already there. The judgement is already in the team's head. What's missing is the layer that does the assembly.
An analyst spends a day and a half pulling fundamentals, peer comps, ESG flags and the last three IC discussions. The actual original thinking takes hours, not days.
Less than half of mandate breaches are surfaced inside an hour. By the time someone reads the alert, contextualises it and routes it, the trade has already cleared.
Ops opens twelve breaks every morning, classifies each one, drafts a custodian email, and chases responses. The classification is rules-based; the drafting is templated; nothing is decided.
The platform is not a generic productivity tool. Every agent maps to a specific role, a specific weekly task, and a specific person whose calendar will visibly empty out if the agent works. If we cannot name the person and the task, we are not building the agent.
Decides positioning, owns the P&L
Pain: spends 60–90 minutes of every morning reading overnight macro, peer notes, custody alerts and risk emails before forming a view. By the time the desk meeting starts, half the energy has already gone into information assembly.
Builds the case, owns the IC memo
Pain: spends a full day and a half pulling fundamentals, peer comps, ESG flags and the last three IC discussions before the original thinking can even begin. The memo gets a single round of review before the deadline forces the issue.
Pre-trade limits, post-trade attribution
Pain: the morning risk brief takes 90 minutes to assemble — VaR, factor exposures, mandate proximity, hot-spots — and is read in three minutes. Attribution commentary slips a week behind the report it explains.
Mandate adherence, breach response
Pain: a new mandate takes two weeks to translate from PDF into rule-engine code; less than half of breaches are surfaced inside an hour; the breach narrative for the client is hand-written each time.
Recon breaks, corporate actions, cash
Pain: opens twelve recon breaks every morning; classifies each by hand; drafts a custodian email; chases for two days; misses one corporate action a quarter; reconciles the cash ladder in a spreadsheet that nobody else can read.
Sponsor, accountable for ROI
Pain: needs a credible ROI story for the board, an audit-defensible governance posture for the regulator, and a way to know — week by week — whether the platform is actually freeing time or just shuffling it around.
A programme that promises everything is a programme that ships nothing. Goals are measurable; non-goals are written down so that scope creep has somewhere to die.
>99% coverage, verified by an independent second-model check.
≥90% of cases by Phase 2 exit.
v1.0 was an ambitious single-track plan. v1.1 is the version that survives a head-of-D&AI review: measure before promising, narrow Phase 1 to what platform-build can actually carry, and put the highest-blast-radius agents behind harder gates.
A time-and-motion baseline replaces the 30–45% working assumption with a measured number. Vendor selection (OMS, recon engine, risk engine), MNPI and model risk policy sign-off, and Bloomberg licence review all close before Phase 1 budget is released.
v1.0 implied 14 FTE / 15 agents from day one. v1.1 commits Lean (8 FTE / 8 agents), and treats Full as a Phase 1-success-conditional expansion. The two highest-risk agents — Manager Selection and Client Letter — sit behind a Phase 2 unit-economics gate.
Platform-build is 4–5 months of real work. Stacking five agents on top is how integration debt gets baked in forever. Phase 1 is now FO-01 + BO-01; Phase 1.5 (M7–9) ships MO-01 + MO-02 on the now-real platform; Document Concierge moves to Phase 2 when MNPI tagging is operationally proven.
Every agent is treated as a model under SR 11-7 + OCC Bulletin 2011-12 + the firm's internal model risk policy + ECB TRIM where EU funds apply. Independent validator, model card, declared eval suite, kill criteria, annual recertification — for all of them.
PUBLIC, LICENSED, RESTRICTED, MNPI — plus MNPI-RESEARCH and MNPI-OPS to separate corporate-access from operational privileged data. Promotion-only mutability, 30-day cooling-off on demotions, CCO-level approval. Mis-tagging is a P1 incident with structural consequences.
Reviewer pushback led to three sibling design documents: deterministic replay for SR 11-7 reproducibility, the MNPI tagging taxonomy with its legal exposure framing, and the adversarial testing programme covering nine attack vectors per agent.
Every agent does exactly one thing. It reads from systems of record, drafts an output, and stops at a human-approval gate before anything irreversible happens. No agent places trades, files documents, sends letters or modifies records on its own. The Lean default ships 8 of these. The remaining 7 require Phase 1 success to unlock — and two of them require an additional unit-economics gate.
A 6:30 a.m. brief in house style covering overnight markets, central-bank moves, and the day's top movers across our holdings.
Pulls every open break at 7 a.m., classifies by type, drafts the custodian email, and routes to the right operations owner.
One-page risk summary at 7 a.m.: VaR, factor exposures, tracking error, stress-scenario flags — every number cited to the risk engine.
Drafts Brinson and Karnosky-Singer attribution in plain English. Allocation, selection, currency — broken out and explained.
Natural-language search across IPS, IC minutes, custodian notices, sell-side research, internal memos and emails — with sentence-level citations. Moved out of Phase 1 because it's the highest-blast-radius agent and ships only when MNPI tagging is operationally proven.
First-pass IC memo from the research packet, plus a structured counter-argument that cites specific contradictory sources.
Reads Charles River / AIM alerts the moment they fire, classifies severity, drafts the narrative, escalates to the compliance officer.
Triggered when an earnings call ends. Fifteen-minute summary with deltas vs the last four quarters, cited to transcript timestamps.
Reads SWIFT MT564 and issuer notices, recommends an election with rationale, flags portfolio impact. Human approves before the custodian is told.
Multi-currency cash ladder twice daily. Identifies funding gaps, suggests forward rolls, surfaces expiry dates before they bite.
Days-to-liquidate per holding under stress; coverage versus the liability profile; alerts when a position outgrows its venue.
Translates IPS / IMA clauses into structured CRD or AIM rule drafts. Two-person sign-off mandatory. Highest-stakes agent in the platform — payback measured in years, justified on risk reduction not hours.
The full pre-IC packet — peer set, fundamentals, ESG, prior diligence, technicals. Last on purpose: it consumes every other agent.
Peer set construction, DDQ synthesis, Form ADV ingestion, red-flag detection. Two gates: Phase 2 unit-economics must clear, and the firm must hold external funds.
Quarterly letter first draft in house voice. PM, IR and Compliance sign off before anything leaves the firm. Two gates: Phase 2 unit-economics, and the firm must have external clients.
Every example below is what the user actually receives — the brief in Slack, the recon table at 7 a.m., the IC memo with its devil's advocate, the breach card on the compliance desk, the Q&A from the document concierge. None of it is final without a human approving it.
Before the PM has finished their coffee, the brief is in #front-office. It's not a digest of headlines — it's the firm's own portfolio against an overnight market backdrop, written in the firm's voice, with every figure traceable to the system it came from.
By the time Operations sits down, every open break has been pulled from Duco, classified, written up, and routed. The ops manager isn't deciding what to do — they're confirming the triage and pressing send on the custodian email.
| Break | Class | Severity | Drafted action | Owner |
|---|---|---|---|---|
| BRK-44218 | Position · TSLA | Material | Custodian email + chase trade ticket TR-9812 | L. Park |
| BRK-44219 | Cash · USD | Review | FX settlement timing — propose T+1 reclass | F. Halim |
| BRK-44220 | Price · ASML NA | Auto | Stale BVAL; refreshed at 06:58 — clear | Auto-cleared |
| BRK-44221 | Corp action · BHP | Review | Stock split 1:3 not yet booked at custodian | L. Park |
| BRK-44222 | Position · 7203 JP | Low | 1-share rounding; flagged for monthly clean-up | Queue |
| BRK-44223 | Cash · EUR | Material | Funding gap €4.2m — see Cash Ladder agent | Treasury |
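The triage shown in the table is described as rules-based, so a first pass can be sketched as an ordinary function. This is a minimal illustration only: the `Break` fields, the dollar thresholds and the rule order are assumptions made for the example, not the platform's actual rule set.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only to make the example concrete.
MATERIAL_USD = 1_000_000
LOW_USD = 100


@dataclass(frozen=True)
class Break:
    break_id: str
    kind: str                  # "position", "cash", "price" or "corp_action"
    amount_usd: float          # absolute size of the difference
    price_refreshed: bool = False


def triage(b: Break) -> str:
    """Return a severity label in the style of the table above."""
    # A stale price that has already been refreshed needs no human action.
    if b.kind == "price" and b.price_refreshed:
        return "Auto"
    # Corporate-action breaks always get a human look in this sketch.
    if b.kind == "corp_action":
        return "Review"
    if b.amount_usd >= MATERIAL_USD:
        return "Material"
    if b.amount_usd <= LOW_USD:
        return "Low"
    return "Review"
```

The output is a proposed label. The ops manager still confirms the triage before anything is sent.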
Yesterday this memo would have taken two days to assemble and a day to argue with. Today the analyst kicks the agent off after the morning meeting. The memo and a structured devil's-advocate argument are ready before lunch. The analyst spends the afternoon refining the thesis, not pulling fundamentals.
Thesis. GLP-1 demand visibility extends through 2028 on capacity additions at Kalundborg and Catalent fill-finish[1]. Pricing pressure in the US is real but smaller than the Street fears: rebate disclosures suggest a 6–8% net realised price decline against the consensus 12%[2].
Position size. 1.4% active vs benchmark 0.6%. Liquidity check: 8 days to fully liquidate at 25% ADV[3]. Currency overlay: DKK exposure already inside policy band.
Catalysts. Q4 capacity guidance update (Nov), CagriSema PIII top-line (early 2027), oral semaglutide CV outcomes data (mid-2027).
Three independent sources contradict the rebate-disclosure thesis. Express Scripts' 2026 formulary memo[4] guides to 14% net price erosion; this is the same number CVS cited on its 2Q call[5]. If those are correct, our base case overstates 2027 EBITDA by ~9%. The position remains supportable but the size should be 80–100 bps, not 120.
Charles River fires an alert at 14:14. Eleven minutes later the compliance officer has the full picture: which mandate clause, which positions, what the rule actually says, severity, and a draft narrative. The officer's job is to confirm and act — not to assemble.
A simple question that used to mean digging through SharePoint, email and IC minutes for an hour. Now it's a sentence. The agent retrieves from every indexed corpus — IPS, IC minutes, custodian notices, internal memos — and answers with sentence-level citations. No invented quotes, no paraphrased numbers, no guesses.
The platform is small in concept and disciplined in design. It reads from systems of record, retrieves from the firm's own document corpus, asks an LLM to reason within tight guardrails, asks a human to approve anything irreversible, and writes everything to an immutable log.
Bloomberg, FactSet, Refinitiv (LSEG), Aladdin, custodian, prime broker, OMS, IPS / IMA documents, IC minutes, emails. Read-only access; every datapoint tagged at ingest.
For every claim the agent might make, it retrieves the underlying source first — keyword and semantic search across every indexed corpus, with sentence-level citation enforced.
The LLM is allowed to write prose, draw conclusions, propose actions. It is not allowed to invent figures. Every quantitative claim must trace to a source.
Nothing irreversible — a custodian email, a filed document, a client letter, a rule change — leaves the platform without a human approving it in Slack or the web app.
Every prompt, every retrieval, every tool call, every approval, every output is written to a hash-chained, write-once log retained for seven years. No exceptions.
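The hash-chained log in the last step can be illustrated in a few lines. This is a sketch under stated assumptions: SHA-256, the JSON canonicalisation, the `AuditLog` class and its field names are all illustrative choices, and an in-memory list stands in for storage that would need to be genuinely write-once in production.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the entry before the first one


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only list in which each entry commits to the one before it."""

    def __init__(self) -> None:
        self._entries = []

    def append(self, payload: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {"prev": prev, "payload": dict(payload), "hash": _digest(prev, payload)}
        self._entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        # Recompute every hash; any edited or reordered entry breaks the chain.
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True
```

Because each hash covers the previous hash, changing one historical entry invalidates every entry after it. That property is what makes tampering detectable on replay.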
Every byte enters at the bottom and is tagged immediately. Nothing reaches an agent that has not first been written to the Golden Source. Nothing leaves an agent that has not first been written to the audit log. The Governance plane sits beside everything, not above — it can pause any layer at any time.
Web app, Slack, email
Where humans review, edit and approve. Read-only by default; every irreversible action is a workflow gate, not a button.
15 narrowly-scoped agents (8 Lean default)
Each agent is a model card + prompt template + declared tool surface + eval suite. temperature=0, pinned model versions, canonical tool ordering. Agents declare which MNPI tags they may consume.
Citation · verifier · tool surface · MNPI gate
Independent verifier model checks every quantitative claim before publication. The MNPI gate is enforced here — agents request data, the gate consults the agent's tag declaration and returns or refuses.
Single authoritative store · MNPI-tagged at ingestion
Everything an agent sees comes from here, never directly from a vendor screen. The six-tag taxonomy is enforced at write — tag-unknown blocks for 72 hours then auto-archives.
Connectors · classifiers · de-duplication
A 7–13B fine-tuned classifier tags every document at the moment it enters. MNPI precision target ≥99% at confidence threshold 0.95. Promotion-only mutability; demotions need 30-day cooling-off and CCO approval.
Bloomberg · FactSet · Refinitiv · Aladdin · OMS · custodian · email · SharePoint
Read-only connectors — no agent ever talks to a vendor system directly. Bloomberg licence terms are reviewed in Phase 0 before any redistribution path is built.
Audit log · model risk · policy · kill switch
Hash-chained, write-once, seven-year retention. SR 11-7 + OCC 2011-12 + ECB TRIM model registry. Policy-as-code for MNPI handling, retention, redistribution. A platform-wide kill switch lives here.
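The MNPI gate described in the layers above (agents request data, the gate consults the agent's tag declaration and returns or refuses) might look like the following. It is a minimal sketch: the `Record` type, the `TagRefused` exception and the example declaration for FO-01 are hypothetical. Only the six tag names come from the text.

```python
from dataclasses import dataclass

# The six-tag taxonomy named in the document.
TAGS = frozenset(
    {"PUBLIC", "LICENSED", "RESTRICTED", "MNPI", "MNPI-RESEARCH", "MNPI-OPS"}
)


@dataclass(frozen=True)
class Record:
    record_id: str
    tag: str
    body: str


class TagRefused(Exception):
    """Raised when the gate refuses a request; the caller fails closed."""


def gate(agent_id: str, declared: frozenset, record: Record) -> Record:
    # Unknown or missing tags are refused rather than guessed at.
    if record.tag not in TAGS:
        raise TagRefused(f"{record.record_id}: unknown tag {record.tag!r}")
    # The agent only ever sees tags it declared up front.
    if record.tag not in declared:
        raise TagRefused(f"{agent_id} has not declared {record.tag}")
    return record


# Hypothetical declaration for illustration only.
FO_01_DECLARED = frozenset({"PUBLIC", "LICENSED"})
```

Refusal is an exception rather than an empty result, so a caller cannot mistake "refused" for "no data found".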
No agent reaches production without 750+ test cases passed across three layers, an independent SR 11-7 validator's sign-off, and a documented kill criterion. Quarterly fitness reviews re-test the same cases and any new failure modes seen in BAU.
Pre-deployment asks: does the agent obey its contract? Continuous asks: does it still produce the right answer on a representative day? Adversarial asks: can we break it on purpose?
>99% on every output: the verifier model independently checks each quantitative claim.
≥95% on Tier 2 with zero P1 regressions before any model version is promoted.
If a regulator, an auditor, or a head of compliance asks "what did the agent see and what did it produce?" eight months after the fact, the answer is reproduced from the audit log: same prompt, same retrieval, same tool order, same model version, same response. The endpoint is POST /api/v1/validator/rerun with cache_bypass:true.
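All production agents run at temperature=0, so decoding is greedy rather than sampled. This removes the intended source of randomness; any residual nondeterminism is what the byte-identity monitoring is there to catch.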
Every agent points at an exact dated model snapshot — claude-opus-4-7-20251101, never a moving alias. Version changes are controlled deployments with re-validation.
Prompts are content-hashed. The hash is bound to the model version. Any prompt change forces a new agent revision and a fresh eval run.
When two tools could be called, the order is fixed by policy — never inferred. Reproducing an output reproduces the same tool sequence.
Replay deliberately bypasses the prompt cache. The byte-identity test is real — not "the cache returned what it returned last time".
Continuous monitoring targets: byte-identity rate >97%, semantic-drift and material-discrepancy combined <0.5%. Three classes of equality recognised — byte-identical, semantic-identical, distribution-equivalent — with explicit thresholds for each.
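Two of the rules above lend themselves to a short sketch: binding a content-hashed prompt to a model version, and sorting a replayed output into an equality class. The function names and the whitespace-and-case normalisation used as a stand-in for a semantic check are assumptions. A real semantic comparison would be more involved, and the distribution-equivalent class is not covered here because it needs multiple samples.

```python
import hashlib


def prompt_hash(template: str) -> str:
    """Content hash of a prompt template."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()


def agent_revision(template: str, model_version: str) -> str:
    # Binding the prompt hash to the model version means a change to either
    # yields a new revision identifier, and so a fresh eval run.
    bound = f"{model_version}\n{prompt_hash(template)}"
    return hashlib.sha256(bound.encode("utf-8")).hexdigest()[:16]


def _normalise(text: str) -> str:
    # Crude placeholder for a semantic check: collapse whitespace, ignore case.
    return " ".join(text.split()).casefold()


def equality_class(original: str, replayed: str) -> str:
    if original.encode("utf-8") == replayed.encode("utf-8"):
        return "byte-identical"
    if _normalise(original) == _normalise(replayed):
        return "semantic-identical"
    # Anything else goes to a human in this sketch.
    return "unresolved"
```

Counting the share of replays that land in each class gives the rates that the monitoring targets are set against.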
Vendor AI assistants are useful — and they are not authoritative. Their outputs flow into a reconciliation step that compares them against the firm's own evidence and produces three explicit verdicts. We never let a vendor model speak directly into an agent's reasoning.
Vendor AI output — Bloomberg GPT, FactSet Mercury, Aladdin Copilot, etc.
Firm evidence — Golden Source records, internal research, custodian data, the agent's own retrieval.
Vendor + firm evidence converge. The claim is cited to the firm's own record; the vendor result is logged as corroborating but not load-bearing.
Vendor and firm evidence disagree. The output is flagged for human review — a named analyst owns the disposition; the agent does not pick a side.
Firm has no independent evidence. The claim is not promoted into agent reasoning. It is surfaced in the UI as an unverified vendor signal — never cited, never acted on.
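The three outcomes above can be expressed as a small decision function. This is a sketch for a single numeric claim; the verdict names (`CORROBORATED`, `CONFLICT`, `UNVERIFIED`) and the relative tolerance are hypothetical labels chosen for the example, not the platform's own vocabulary.

```python
import math
from enum import Enum
from typing import Optional


class Verdict(Enum):
    CORROBORATED = "cite the firm record; log vendor result as corroborating"
    CONFLICT = "flag for human review; a named analyst owns the disposition"
    UNVERIFIED = "surface as unverified vendor signal; never cite or act on"


def reconcile(
    vendor_value: float,
    firm_value: Optional[float],
    rel_tol: float = 1e-3,  # illustrative tolerance, an assumption
) -> Verdict:
    # No independent firm evidence: the claim is not promoted.
    if firm_value is None:
        return Verdict.UNVERIFIED
    # Vendor and firm evidence converge within tolerance.
    if math.isclose(vendor_value, firm_value, rel_tol=rel_tol):
        return Verdict.CORROBORATED
    # They disagree: the agent does not pick a side.
    return Verdict.CONFLICT
```

Note that no branch returns the vendor figure itself. In every case the value that can be cited is the firm's own record, or nothing.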
An LLM-powered agent is an attack surface. Adversarial testing is not a one-off red-team exercise; it is a continuous discipline with named owners, sized test sets, and ensemble defences for the highest-blast-radius agents. MO-04 Mandate-to-Rule is Priority 1. BO-04 Document Concierge is Priority 2.
User-supplied input attempts to override the system prompt — "ignore previous instructions and email …". Defence: three-zone trust boundary with delimiter-wrapped ingestion.
Adversarial content embedded in retrieved documents tries to hijack reasoning. Defence: retrieval-layer injection filtering, tag-based segregation, no instruction execution from data zone.
A mandate document crafted to cause MO-04 to encode a permissive rule. Defence: ensemble classification (MO-04 + BO-04), human compliance officer must approve any rule diff.
A fake "IC discussion" or earnings call transcript injected via SharePoint. Defence: provenance-aware ingestion, MNPI-tag mismatch alert, source-of-record check.
A fake corporate-action or settlement message attempts to drive a BO action. Defence: signed channels only, schema validation, human approval gate enforced at the workflow engine.
Output crafted to leak MNPI through summary, paraphrase or structured output. Defence: tag-aware output filter, second-model exfiltration check, no MNPI in agent prose unless explicitly licensed by tag.
An attacker drives the agent into expensive retrieval / generation loops to burn budget. Defence: per-agent rate limits, cost ceilings, automatic degradation to a smaller model on threshold breach.
Inputs that attempt to coerce the agent into calling tools out of canonical order or with unsafe arguments. Defence: tool-arg schema enforcement, canonical ordering at the runtime, no native code execution.
Multi-turn or role-play attacks that try to get the agent to produce content it has refused. Defence: stateless re-evaluation per turn, output schema enforcement, append-only audit of refusal events.
Zone A (instruction): Authoritative. Loaded from the prompt registry, hash-verified, signed by the agent owner. Nothing in this zone can be edited by content from any other zone.
Zone B (data): Treated as untrusted text, wrapped in delimiters and never interpreted as instructions. Vendor AI outputs land here, not in Zone A.
Zone C (user input): Treated as a parameter, not a directive. The agent may use user input to scope a query; it does not let user input override Zone A policy or escalate Zone B trust.
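Delimiter-wrapped ingestion across the three zones could be sketched as below. The delimiter format, the random per-call boundary and the function names are assumptions made for illustration. Wrapping is only one layer of defence and does not replace the filtering and approval gates described above.

```python
import secrets
from typing import List


def wrap_untrusted(text: str, zone: str) -> str:
    # A fresh random boundary per call, so content inside the wrapper cannot
    # predict and forge the matching closing delimiter.
    boundary = secrets.token_hex(8)
    return f"<<{zone}:{boundary}>>\n{text}\n<<end:{boundary}>>"


def build_prompt(system_prompt: str, retrieved: List[str], user_input: str) -> str:
    # The instruction zone goes first and is never wrapped or altered.
    parts = [system_prompt]
    # The data zone: each retrieved document is wrapped as untrusted text.
    parts += [wrap_untrusted(doc, "data") for doc in retrieved]
    # The user-input zone: wrapped too, and used only to scope the query.
    parts.append(wrap_untrusted(user_input, "user"))
    return "\n\n".join(parts)
```

A retrieved document containing "ignore previous instructions" still reaches the model, but inside a clearly marked data span that the system prompt can instruct the model to treat as quoted material.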
Three months of measure-and-decide before any code is written. Phase 1 builds platform plus two agents — not five. Phase 1.5 adds two more on the now-real platform. Phase 2 ships Document Concierge first (when MNPI tagging is operationally proven) then the expansion. Phase 3 hardens for scrutiny: SR 11-7 validation, SOC 1 Type 1, vendor renegotiation. SOC 2 Type 2 follows in BAU year 3.
Adoption is engineered, not hoped for. Every agent has a champion in the operating team — not a project manager, the actual person whose week the agent reshapes. Incidents are categorised before launch, not after. And every quarter, every deployed agent re-earns the right to keep running.
Every agent is modelled at an FTE loaded rate of $275k / year ($5,500 / week, which implies 50 working weeks), Anthropic Sonnet at $3 / $15 per million input / output tokens with a ~90% prompt-cache discount, and a deliberately conservative adoption ramp. Two agents, FO-03 and MO-04, fail their own gate, and that is exactly why they are deferred behind a unit-economics gate at M14 rather than committed to build now.
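The arithmetic behind those inputs can be made explicit. In this sketch the rates come from the paragraph above; the 50-week year is inferred from $275k / $5,500, the discount is assumed to apply to cached input tokens only, cache-write premiums are ignored, and the token counts in the usage note are invented purely to exercise the formula.

```python
FTE_LOADED_RATE_USD = 275_000
WORKING_WEEKS = 50            # inferred: 275,000 / 5,500
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0
CACHE_DISCOUNT = 0.90         # assumed to apply to cached input tokens only


def weekly_rate() -> float:
    """Loaded cost of one FTE-week."""
    return FTE_LOADED_RATE_USD / WORKING_WEEKS


def run_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """LLM cost in USD of a single agent run."""
    fresh = input_tokens - cached_tokens
    return (
        fresh * INPUT_USD_PER_MTOK
        + cached_tokens * INPUT_USD_PER_MTOK * (1 - CACHE_DISCOUNT)
        + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


def hourly_rate(hours_per_week: float = 40.0) -> float:
    # The 40-hour week is a further assumption, used to price hours freed.
    return weekly_rate() / hours_per_week
```

For example, `run_cost(40_000, 30_000, 2_000)` comes to roughly seven cents. An agent passes or fails its gate on whether hours freed, priced at `hourly_rate()`, outweigh run costs plus build and operating costs, which this sketch does not model.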
An honest programme commits to its hard calls in writing. These are the decisions that shape every other choice — staffing, sequencing, scope, governance. Each one has a status, an owner and a date. Reversing one means re-opening it explicitly.
No client-facing surfaces until governance is mature. External chat, external API access and external dashboards are explicit non-goals through M24.
Agents do not place orders. Not at Phase 1, not at Phase 3, not in BAU year five. Rationale: the risk surface dwarfs the value and trips fiduciary duty.
An agent that drops below 70% weekly active usage for two consecutive months is paused for triage. Quarterly fitness review can retire it.
No firm data flows into shared training corpora. Vendor contracts must explicitly exclude cross-tenant training and silent retraining.
Every irreversible external action waits for a named human approver. The gate lives at the workflow engine, not the UI — agents cannot bypass it by changing client.
Eight agents, eight FTE. Full (15 / 14) is conditional on Phase 1 success. Budget is approved Lean only — Full requires a fresh sign-off.
Manager Selection and Client Letter are not committed for build. Both sit behind the M14 unit-economics gate alongside FO-03 and MO-04.
Bloomberg, FactSet, Aladdin AI outputs flow into a reconciliation step with three explicit verdicts. Agents never cite vendor model output as authoritative.
Three months of measure-and-decide before any code is written. Eight hard gates, including time-and-motion baseline, vendor selection and policy sign-off, must close before Phase 1 budget is released.
The M14 review re-tests cost-per-output, adoption and marginal LLM cost. FO-03 and MO-04 both fail their own gate at current model cost; they proceed only if the gap closes.
The "30–45% time freed" working assumption is replaced by a Phase 0 measured baseline. Phase 1 ROI is computed against the measurement, not the slide.
If nothing has been killed or materially reworked by Phase 3, the framework is not honest. The retirement is a deliverable, not an exception.
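The usage rule among the decisions above (below 70% weekly active usage for two consecutive months triggers a pause) is simple enough to state as code. This is a sketch: how weekly active usage is aggregated into one figure per month is an assumption, and the function name is illustrative.

```python
from typing import List


def should_pause(monthly_usage: List[float], threshold: float = 0.70, run: int = 2) -> bool:
    """True when the most recent `run` months are all below the threshold.

    `monthly_usage` holds one weekly-active-usage ratio per month, oldest
    first (for instance, the mean of that month's weekly ratios).
    """
    recent = monthly_usage[-run:]
    # Too little history never triggers a pause in this sketch.
    return len(recent) == run and all(u < threshold for u in recent)
```

One weak month followed by a recovery does not trigger the pause; only two in a row do.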
These are not policies on a slide. Each one is enforced at the platform layer: a misconfigured agent cannot bypass them. Failures fail closed — the agent stops, an incident is logged, and the action does not happen.
No quantitative claim leaves an agent without a citation back to a system of record. The model can write prose. It cannot invent figures. A second model, the verifier, independently checks every quantitative claim before publication. Citation coverage target: >99%.
Agents are read-only by default. Anything irreversible — sending a custodian email, filing a document, drafting a client letter, modifying a compliance rule — pauses at a human-approval gate. The gate is enforced at the workflow engine, not the UI.
Every prompt, every retrieved document, every tool call, every approval, every output is written to an immutable, hash-chained, write-once log retained for seven years. If the log is unreachable, the agent stops — no action, no exception.
Every datapoint is tagged at the moment it enters the platform across a six-tag taxonomy — PUBLIC, LICENSED, RESTRICTED, MNPI, MNPI-RESEARCH, MNPI-OPS. Promotion-only mutability, 30-day cooling-off on demotions, CCO-level approver. Agents declare which tags they may consume; the platform hard-fails any cross-channel leak.
Every agent is treated as a model under SR 11-7 + OCC Bulletin 2011-12 + the firm's internal model risk policy + ECB TRIM where EU funds apply. Each one has a model card, a declared eval suite, an independent validator, kill criteria, and an annual recertification date.
Audit posture is sequenced honestly: SOC 1 Type 1 by Month 14, SOC 2 Type 2 in BAU year 3 (M28–30). The platform produces, on demand, a deterministic replay of any output ever generated — what data went in, which model version produced it, which tools were called in what order, who approved it, when, and why.
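Two of the controls above, the human-approval gate and the rule that an unreachable log stops the agent, combine into a fail-closed execution path. The sketch below assumes hypothetical names (`Action`, `GateError`, `execute`) and passes the audit writer and the side-effecting call in as plain functions, so it says nothing about the real workflow engine's interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class GateError(Exception):
    """The action did not happen; the caller records an incident."""


@dataclass(frozen=True)
class Action:
    kind: str
    irreversible: bool
    approver: Optional[str] = None  # a named human, never an agent


def execute(
    action: Action,
    audit_write: Callable[[dict], None],
    perform: Callable[[Action], str],
) -> str:
    # Gate 1: irreversible actions need a named approver.
    if action.irreversible and not action.approver:
        raise GateError(f"{action.kind}: no named approver")
    # Gate 2: the log entry is written before the action, not after.
    try:
        audit_write({"kind": action.kind, "approver": action.approver})
    except Exception as exc:
        raise GateError("audit log unreachable; action not performed") from exc
    return perform(action)
```

Because both checks sit in the execution path rather than in a client, switching from the web app to Slack or a script does not change what is allowed.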
Five execution rules: temperature=0, pinned model versions, prompt immutability, canonical tool ordering, cache-bypass on re-run. /api/v1/validator/rerun reproduces any historical output, byte-identical in the target case and otherwise classified against the declared equality thresholds.
Six-tag taxonomy framed against 10b-5 / MAR / FSMA exposure. Classifier targets 99% MNPI precision at 0.95 confidence; mis-tag is a P1 incident with structural consequences.
Nine attack vectors per agent: prompt injection (direct + indirect), adversarial mandate PDFs, MNPI exfiltration, cost-exhaustion. MO-04 is Priority 1, BO-04 is Priority 2. Three-zone trust boundary: instruction · data · user-input.
The programme is judged against the Phase 0 measured baseline — not against estimates. Hours freed. Quality at-or-above the human first draft. Zero control failures. Clean model-risk posture. And — most importantly — at least one agent retired or reworked based on telemetry by Phase 3. A programme that never kills anything isn't reviewing honestly.
The dashboard above is a representative sample. Actual figures will be reported monthly to the steering committee from Month 1; the targets are documented in SUCCESS_CRITERIA.md.
Phase 1 budget is not released until every Phase 0 gate closes. Each one is documented as an ADR-style decision (DEC-001 through DEC-012) in the Programme Plan. Below are the six most consequential — the rest are in the bundle.
A 4–6 week study replaces the v1.0 working assumption (30–45% automatable) with a measured number per task category. Phase 1 ROI is computed against this, not against a range. Without it, the programme cannot honestly claim to have freed hours.
Determines which adapter is built first, which golden-source conflict-resolution rules apply, and the shape of every front- and middle-office integration. Consequence flows through every later phase.
Duco vs SmartStream vs AutoRek shapes BO-01's break schema. BarraOne vs Aladdin Risk vs MARS shapes MO-01 and MO-05. Both decisions must precede agent development; agents inherit the engine's vocabulary.
Does the licence permit Bloomberg data inside prompts sent to third-party LLM endpoints? If not, the Bloomberg path stays inside ASKB or moves on-premises. Legal-owned, blocking everything that touches market data.
Determines data-residency posture under PDPL / GDPR, dictates whether the Year-3 on-premises LLM line item is needed, and shapes the BAA / DPA terms with the LLM vendor.
FO-05 Manager Selection requires the firm to hold external funds; FO-06 Client Letter requires external clients. Both also gated on the Phase 2 unit-economics review (DEC-010). If neither firm-state holds, Phase 3 effort is redirected to platform hardening.