The IR Digest

Edition 12 · 17 May 2026

On 29 April, an Amsterdam developer published the local AI setup he had built to replace his Claude Max subscription. On 2 May, The Register published a guide to doing the same. On 8 May, Coder shipped a self-hosted agent runtime designed to keep prompts inside the customer's network. None of these events is individually decisive. Together, they are a reason to look again at the assumptions underneath most current AI procurement specifications.

Deep Dive: Local-First Agent Stacks

An Amsterdam Developer Cancels His Claude Max Subscription

Willem van den Ende is an independent consultant in Amsterdam who has worked as a software developer for more than twenty years. On 28 April 2026 he posted on LinkedIn that he had cancelled his Claude Max subscription, which was costing him one hundred US dollars a month. The next day he published a piece on his own website setting out the local setup that had replaced it. The hardware is a refurbished MacBook Pro M3 Max with 64 gigabytes of RAM. The models are open weights, served by llama.cpp through a script that rebuilds against current releases. The coding harness is Pi.dev. Routine chat and brainstorming run through GPTEL inside Emacs.

His reason for switching is set out plainly. Over the month before he wrote, the open-weights models he uses had become roughly twice as fast on the same hardware, with capability he judged sufficient for his work. On his account, the hosted subscription no longer did enough beyond what the local setup could do to justify the monthly cost.

29 Apr: van den Ende publishes his local-first setup, having cancelled Claude Max
2 May: The Register publishes a guide to running local AI coding agents
8 May: Coder Agents enters beta as a self-hosted, in-network runtime

Sources: Willem van den Ende: My local agentic dev setup today · The Register: How to roll your own local AI coding agents · SD Times: Coder Agents launch

Three further events in the past fortnight are worth holding against the van den Ende piece. On 2 May, The Register, the UK's principal mainstream technology publication for working IT professionals, ran a guide to setting up local AI coding agents. On 6 May, in a live blog from Anthropic's Code w/ Claude conference, Simon Willison reported that managed multi-agent orchestration and persistent Claude Code routines were being released as first-party Anthropic features. On 8 May, the developer-environment company Coder launched Coder Agents in beta, a self-hosted runtime designed to keep both source code and prompts inside the customer's network perimeter. The van den Ende piece, the Register guide, and the Coder launch are points along the same line. The Anthropic announcement is at the other end of it. Both ends of the spectrum gained credible new options in the same fortnight, while the hosted-plus-vendor-orchestration middle, where most procurement specifications are currently written, did not.


What This Means for Procurement

Three Arguments Worth Re-examining

For most of the last two years, the procurement question for AI tooling has been which hosted vendor to choose. The choice between hosted and self-hosted was not, in practice, a live one. Three arguments did most of the work in keeping it that way.

The three arguments ran as follows. First, that local open-weights models were not good enough to support production work. This was correct in 2024 and defensible through most of 2025. For coding, drafting, and retrieval-augmented tasks, it has become difficult to maintain in mid-2026 without specific evidence to the contrary. Second, that running models locally was operationally beyond the reach of most internal teams. This remains true for many organisations, but the existence of products such as Coder Agents and Pi.dev means the claim now needs to be tested against current options rather than assumed. Third, that the cost of self-hosting exceeded the data-residency benefit. In 2024, hosted and local costs were an order of magnitude apart. For many workloads the ratio is now closer to two to one, and it continues to narrow.

None of this is an argument for moving to a local-first posture. For most organisations the hosted default will remain the right choice for some time. The argument is for refreshing the working that supports the current decision. A procurement choice that was right in late 2024 may still be right in May 2026, but the comparison should be on current numbers and on file. For organisations subject to public-sector procurement scrutiny, charity-funder requirements, or sector-specific data-residency rules, an out-of-date comparison is itself becoming an audit finding.

The supplier-side conversation is harder to evidence in public, but the pattern is starting to show. In recent client work, Intelligent Resilience has seen three vendors that previously offered only hosted AI features add self-hosted or hybrid options to their roadmaps in the past month. The reasons are not in any press release. The trigger appears to be procurement conversations with European public-sector and regulated-industry buyers. Renewal discussions in 2026 are a reasonable point at which to ask suppliers whether self-hosted options exist or are planned. The answer is useful regardless of whether the option is taken up.


Governance & Assurance

What This Means for the Risk Register

In most of the risk registers Intelligent Resilience has reviewed in 2026, generative AI sits as a single line, with a single owner and a single risk score. That treatment was defensible while the architectural choices behind every AI workflow in the organisation were broadly the same. It is becoming less so. Three architectural patterns are now in regular use, and each carries a different data-residency, audit, and blast-radius profile.

The first is the fully managed pattern: a hosted frontier model, vendor-supplied orchestration, with prompts and outputs traversing the vendor's infrastructure. The second is the self-hosted pattern: an open-weights model running on infrastructure the organisation controls, with orchestration on the same network. The third, and currently the most common in practice, is the hybrid pattern: a hosted model accessed through self-hosted scaffolding, with routing to different models depending on the sensitivity of the input. A single risk-register line cannot honestly describe three different architectures with different control profiles, and trying to do so makes the register harder to defend in an audit.
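The routing step in the hybrid pattern can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular product: the three sensitivity levels, the model labels, and the rule that unclassified input is kept on controlled infrastructure are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Sensitivity(Enum):
    # Hypothetical classification levels; substitute the organisation's own.
    PUBLIC = "public"
    INTERNAL = "internal"
    RESTRICTED = "restricted"


@dataclass(frozen=True)
class Route:
    hosting: str  # "hosted" or "self-hosted"
    model: str    # placeholder label, not a real model name


# Assumed policy: only restricted input is kept on controlled infrastructure.
ROUTES = {
    Sensitivity.PUBLIC: Route("hosted", "hosted-frontier-model"),
    Sensitivity.INTERNAL: Route("hosted", "hosted-frontier-model"),
    Sensitivity.RESTRICTED: Route("self-hosted", "local-open-weights-model"),
}

# Fail closed: input with no classification never leaves the network.
DEFAULT_ROUTE = ROUTES[Sensitivity.RESTRICTED]


def route_for(sensitivity: Optional[Sensitivity]) -> Route:
    """Return the route for an input, defaulting to the self-hosted model."""
    if sensitivity is None:
        return DEFAULT_ROUTE
    return ROUTES[sensitivity]
```

The point for the risk register is that the routing table is itself a control. Who may edit it, and what happens to unclassified input, belong in the hybrid entry's description.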

A diagnostic question to put to the executive team, in writing, before the next risk committee: for each material AI workflow the organisation operates today, can the responsible executive name the model, the hosting jurisdiction, the orchestration layer, and the individual accountable for the workflow's behaviour? And if the workflow were moved between the three patterns above, who would have to approve the change? Where the answers are not readily available, the issue is not weakness in the controls. It is that the architecture has not been made legible to the people who are formally accountable for it. That is a question of governance documentation, and it can be closed within a single committee cycle.
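The diagnostic can be run as a simple completeness check. The sketch below assumes one record per workflow; the field names are hypothetical labels for the answers the question asks for, and the example workflow is invented for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class WorkflowRecord:
    # One record per material AI workflow. Field names are illustrative.
    name: str
    model: Optional[str] = None
    hosting_jurisdiction: Optional[str] = None
    orchestration_layer: Optional[str] = None
    accountable_individual: Optional[str] = None
    change_approver: Optional[str] = None


def missing_answers(record: WorkflowRecord) -> List[str]:
    """List the questions that nobody has yet answered for this workflow."""
    return [
        f.name
        for f in fields(record)
        if f.name != "name" and not getattr(record, f.name)
    ]


# Invented example: the model and orchestration are known, the rest is not.
example = WorkflowRecord(
    name="Contract summarisation",
    model="hosted frontier model",
    orchestration_layer="vendor-supplied",
)
gaps = missing_answers(example)
```

An empty list for every workflow means the architecture is legible. A non-empty list is the documentation gap to close before the committee meets.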

The EU AI Act high-risk deadline of 2 August, set out in Edition 11, is a useful forcing date for closing this gap. The Act's obligations on data governance, technical documentation, and human oversight are easier to discharge against a documented architecture than against an undocumented one. Documenting the architecture in May, while the work is not under deadline pressure, is likely to produce a better result than documenting it in late July, when it will be.

Source: EU AI Act high-risk obligations: implementation timeline


Horizon Watch

Three Competing Definitions of the Agent Stack

Three publications in the past fortnight have proposed layered models of the modern AI agent stack: Aishwarya Naresh Reganti's The AI Agent Stack in 2026 in The Nuanced Perspective; the Glean engineering blog's emerging agent architecture; and MindStudio's six layers every AI builder needs to understand. The three models overlap in places but do not coincide. None of them maps directly onto the eight-layer taxonomy that Intelligent Resilience uses for its own assessments. This is normal at this stage of a field. Public taxonomies in a fast-moving area tend to proliferate before they consolidate, and engineering-led taxonomies typically settle first.

The IR taxonomy is governance-oriented. The public ones are largely engineering-oriented. The two will need to be reconciled eventually in contracts, specifications, and audit reports, and the side whose definitions are written down at the time of the conversation will have the easier argument to make. For organisations drafting AI-related procurement language in 2026, the practical recommendation is to attach a short internal glossary of the terms used in the specification, agreed within the organisation before it is sent to suppliers. The cost of doing so is small. The cost of working from incompatible definitions of agent, orchestration, or credentials in a 2027 contractual dispute is not.

Sources: Reganti: The AI Agent Stack in 2026 · Glean: The emerging agent architecture · MindStudio: The six layers of agent infrastructure


For CISOs and Technical Leaders

Commission a current one-page comparison of the hosted, self-hosted, and hybrid patterns for the two or three AI workflows the organisation depends on today. Three columns: cost over twelve months, data-residency posture, and operational complexity scored against the team you actually have. The aim of the exercise is not to switch patterns. It is to have current working on file, so that when procurement, audit, or a funder asks why the organisation is on the pattern it is on, the answer can be produced quickly and with evidence. If the most recent version of that working is more than nine months old, treat it as out of date.
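A minimal sketch of that comparison as a dated record, with the nine-month rule applied mechanically. The three columns and the nine-month threshold come from the paragraph above. The field names, the one-to-five complexity scale, and every value in the example row are placeholders, not real figures.

```python
from dataclasses import dataclass
from datetime import date

MAX_AGE_MONTHS = 9  # the threshold suggested above


@dataclass
class PatternComparison:
    workflow: str
    pattern: str                 # "hosted", "self-hosted" or "hybrid"
    cost_12_months: float        # same currency across all rows
    data_residency: str          # short description of where data goes
    operational_complexity: int  # assumed scale: 1 (low) to 5 (high)
    prepared_on: date


def is_out_of_date(prepared_on: date, today: date) -> bool:
    """True if the working is more than nine calendar months old."""
    whole_months = (today.year - prepared_on.year) * 12 + (
        today.month - prepared_on.month
    )
    if whole_months > MAX_AGE_MONTHS:
        return True
    return whole_months == MAX_AGE_MONTHS and today.day > prepared_on.day


# Placeholder row; the values are invented for illustration only.
row = PatternComparison(
    workflow="Code review assistant",
    pattern="hosted",
    cost_12_months=0.0,
    data_residency="to be confirmed with supplier",
    operational_complexity=1,
    prepared_on=date(2025, 8, 1),
)
stale = is_out_of_date(row.prepared_on, today=date(2026, 5, 17))
```

Keeping the preparation date on each row, not on the document as a whole, lets a partial refresh be evidenced honestly.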

For Heads of Governance, Risk and Compliance

Split the single generative-AI line on the risk register into three entries, one per architectural pattern in current use, each with its own owner, residency description, and risk score. Where a workflow does not fit one of the three cleanly, document the hybrid explicitly. Add to the next risk-committee pack a short table mapping each material AI workflow to its model, hosting jurisdiction, orchestration layer, and named owner. The same table will form the first page of the EU AI Act 2 August submission, so producing it once in May is less work than producing it again in late July under deadline.
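The split can be derived from the workflow table instead of being drafted by hand. The sketch below assumes each workflow has already been assigned to one of the three patterns. The pattern labels and workflow names are illustrative, and owners and risk scores are deliberately left out because they are judgements for the committee, not something to compute.

```python
from typing import Dict, List, Tuple

# Labels for the three architectural patterns described in this edition.
PATTERNS = ("fully managed", "self-hosted", "hybrid")


def split_register(workflows: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (workflow name, pattern) pairs into one entry per pattern in use."""
    entries: Dict[str, List[str]] = {pattern: [] for pattern in PATTERNS}
    for name, pattern in workflows:
        if pattern not in entries:
            # An unrecognised pattern is a documentation gap, not a fourth entry.
            raise ValueError(f"{name!r} has unrecognised pattern {pattern!r}")
        entries[pattern].append(name)
    # Only patterns actually in use become register entries.
    return {pattern: names for pattern, names in entries.items() if names}


# Invented workflows for illustration.
register = split_register(
    [
        ("Contract summarisation", "fully managed"),
        ("Code review assistant", "hybrid"),
        ("Helpdesk drafting", "fully managed"),
    ]
)
```

Each key in the result is a candidate register entry, and the list beneath it is the set of workflows that entry's owner is answering for.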

For Finance Directors and Non-Specialist Leaders

Put one question to the technology lead at the next leadership meeting. For each AI tool the organisation is paying for today, where does the data physically go when it is processed, and is a self-hosted version of the tool available if the data-residency picture changes? The answer does not need to lead to a change of supplier. It needs to be known, written down, and dated. If the supplier offers no self-hosted option and has no plans for one, that is information worth carrying into the next renewal. If a self-hosted option exists, whether to use it is a separate decision, taken on its own merits.