Willem van den Ende is an independent consultant in Amsterdam who has worked as a software developer for more than twenty years. On 28 April 2026 he posted on LinkedIn that he had cancelled his Claude Max subscription, which was costing him one hundred US dollars a month. The next day he published a piece on his own website setting out the local setup that had replaced it. The hardware is a refurbished MacBook Pro M3 Max with 64 gigabytes of RAM. The models are open weights, served by llama.cpp through a script that rebuilds against current releases. The coding harness is Pi.dev. Routine chat and brainstorming run through GPTEL inside Emacs.
His reason for switching is set out plainly. Over the month before he wrote, the open-weights models he uses had become roughly twice as fast on the same hardware, with capability he judged sufficient for the work he gives them. On his account, the hosted subscription no longer did enough that the local setup could not do to justify the monthly cost.
Source: Willem van den Ende: My local agentic dev setup today · The Register: How to roll your own local AI coding agents · SD Times: Coder Agents launch
Three further events in the past fortnight are worth holding against the van den Ende piece. On 2 May, The Register, the UK's principal mainstream technology publication for working IT professionals, ran a guide to setting up local AI coding agents. On 6 May, in a live blog from Anthropic's Code w/ Claude conference, Simon Willison reported that managed multi-agent orchestration and persistent Claude Code routines were being released as first-party Anthropic features. On 8 May, the developer-environment company Coder launched Coder Agents in beta, a self-hosted runtime designed to keep both source code and prompts inside the customer's network perimeter. The van den Ende piece, the Register guide, and the Coder launch are points along the same line. The Anthropic announcement is at the other end of it. Both ends of the spectrum gained credible new options in the same fortnight, while the hosted-plus-vendor-orchestration middle, where most procurement specifications are currently written, did not.
For most of the last two years, the procurement question for AI tooling has been which hosted vendor to choose. The choice between hosted and self-hosted was not, in practice, a live one. Three arguments did most of the work in keeping it that way.
None of this is an argument for moving to a local-first posture. For most organisations the hosted default will remain the right choice for some time. The argument is for refreshing the working that supports the current decision. A procurement choice that was right in late 2024 may still be right in May 2026, but the comparison should be on current numbers and on file. For organisations subject to public-sector procurement scrutiny, charity-funder requirements, or sector-specific data-residency rules, an out-of-date comparison is itself becoming an audit finding.
The supplier-side conversation is harder to evidence in public, but the pattern is starting to show. In recent client work, Intelligent Resilience has seen three vendors that previously offered only hosted AI features add self-hosted or hybrid options to their roadmaps in the past month. The reasons are not in any press release. The trigger appears to be procurement conversations with European public-sector and regulated-industry buyers. Renewal discussions in 2026 are a reasonable point at which to ask suppliers whether self-hosted options exist or are planned. The answer is useful regardless of whether the option is taken up.
In most of the risk registers Intelligent Resilience has reviewed in 2026, generative AI sits as a single line, with a single owner and a single risk score. That treatment was defensible while the architectural choices behind every AI workflow in the organisation were broadly the same. It is becoming less so. Three architectural patterns are now in regular use, and each carries a different data-residency, audit, and blast-radius profile.
The first is the fully managed pattern: a hosted frontier model, vendor-supplied orchestration, with prompts and outputs traversing the vendor's infrastructure. The second is the self-hosted pattern: an open-weights model running on infrastructure the organisation controls, with orchestration on the same network. The third, and currently the most common in practice, is the hybrid pattern: a hosted model accessed through self-hosted scaffolding, with routing to different models depending on the sensitivity of the input. A single risk-register line cannot honestly describe three different architectures with different control profiles, and trying to do so makes the register harder to defend in an audit.
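The routing decision at the heart of the hybrid pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any vendor's product: the three sensitivity levels, the two endpoint URLs, and the rule that unclassified input stays on controlled infrastructure are all invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Sensitivity(Enum):
    # Hypothetical classification levels; use the organisation's own scheme.
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Route:
    description: str
    endpoint: str


# Both endpoints are placeholders, not real services.
HOSTED = Route("hosted model via self-hosted scaffolding",
               "https://hosted-model.example.invalid/v1")
LOCAL = Route("open-weights model on controlled infrastructure",
              "http://localhost:8080/v1")


def choose_route(sensitivity: Optional[Sensitivity]) -> Route:
    """Pick a model route from the sensitivity of the input.

    Assumed policy: restricted input never leaves the network, and input
    with no classification is treated as restricted (fail closed).
    """
    if sensitivity is None or sensitivity is Sensitivity.RESTRICTED:
        return LOCAL
    return HOSTED


if __name__ == "__main__":
    for level in (Sensitivity.PUBLIC, Sensitivity.RESTRICTED, None):
        print(level, "->", choose_route(level).description)
```

The fail-closed default is the design choice worth noting. A routing rule that sends unclassified input to the hosted model turns every classification gap into a data-residency question.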
The EU AI Act high-risk deadline of 2 August 2026, set out in Edition 11, is a useful forcing date for closing this gap. The Act's obligations on data governance, technical documentation, and human oversight are easier to discharge against a documented architecture than against an undocumented one. Documenting the architecture in May, while the work is not under deadline pressure, is likely to produce a better result than documenting it in late July, when it will be.
Source: EU AI Act high-risk obligations: implementation timeline
Three publications in the past fortnight have proposed layered models of the modern AI agent stack: Aishwarya Naresh Reganti's "The AI Agent Stack in 2026" in The Nuanced Perspective, the Glean engineering blog's piece on the emerging agent architecture, and MindStudio's guide to the six layers every AI builder needs to understand. The three models overlap in places but do not coincide. None of them maps directly onto the eight-layer taxonomy that Intelligent Resilience uses for its own assessments. This is normal at this stage of a field. Public taxonomies in a fast-moving area tend to proliferate before they consolidate, and engineering-led taxonomies typically settle first.
The IR taxonomy is governance-oriented. The public ones are largely engineering-oriented. The two will need to be reconciled eventually in contracts, specifications, and audit reports, and the side whose definitions are written down at the time of the conversation will have the easier argument to make. For organisations drafting AI-related procurement language in 2026, the practical recommendation is to attach a short internal glossary of the terms used in the specification, agreed within the organisation before it is sent to suppliers. The cost of doing so is small. The cost of working from incompatible definitions of agent, orchestration, or credentials in a 2027 contractual dispute is not.
Source: Reganti: The AI Agent Stack in 2026 · Glean: The emerging agent architecture · MindStudio: The six layers of agent infrastructure
Commission a current one-page comparison of the hosted, self-hosted, and hybrid patterns for the two or three AI workflows the organisation depends on today. Three columns: cost over twelve months, data-residency posture, and operational complexity scored against the team you actually have. The aim of the exercise is not to switch patterns. It is to have current working on file, so that when procurement, audit, or a funder asks why the organisation is on the pattern it is on, the answer can be produced quickly and with evidence. If the most recent version of that working is more than nine months old, treat it as out of date.
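For teams that keep the comparison in a repository or shared drive, the nine-month rule is simple to check mechanically. The Python sketch below is one possible reading of it, with assumed names throughout: a comparison is treated as out of date when the date it was prepared falls earlier than the same day of the month nine calendar months before today.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ComparisonRow:
    # One row per pattern; the three scored columns follow the text above.
    pattern: str                 # "hosted", "self-hosted" or "hybrid"
    cost_twelve_months: float    # in whatever currency the organisation uses
    data_residency: str
    operational_complexity: int  # scored against the team actually in place


def months_before(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`.

    The day of the month is clamped where the target month is shorter,
    so 31 May minus three months gives the last day of February.
    """
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def is_out_of_date(prepared_on: date, today: date, max_age_months: int = 9) -> bool:
    """True when the working is more than `max_age_months` months old."""
    return prepared_on < months_before(today, max_age_months)


if __name__ == "__main__":
    today = date(2026, 5, 15)
    print(is_out_of_date(date(2024, 11, 1), today))  # a late-2024 comparison
    print(is_out_of_date(date(2026, 1, 10), today))  # a January 2026 refresh
```

Recording the preparation date on the one-page comparison itself is what makes a check like this possible.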
Split the single generative-AI line on the risk register into three entries, one per architectural pattern in current use, each with its own owner, residency description, and risk score. Where a workflow does not fit one of the three cleanly, document the hybrid explicitly. Add to the next risk-committee pack a short table mapping each material AI workflow to its model, hosting jurisdiction, orchestration layer, and named owner. The same table will form the first page of the EU AI Act 2 August submission, so producing it once in May is less work than producing it again in late July under deadline.
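The workflow table described above is small enough to hold as structured data, which makes gaps visible before the risk committee sees them. The Python sketch below is illustrative only: the field names follow the columns listed above, while the pattern labels, the example workflows, their owners, and the completeness check are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, fields
from typing import Dict, List

# Labels for the three architectural patterns (naming is an assumption).
PATTERNS = ("managed", "self-hosted", "hybrid")


@dataclass(frozen=True)
class WorkflowEntry:
    workflow: str
    pattern: str
    model: str
    hosting_jurisdiction: str
    orchestration_layer: str
    owner: str


def incomplete_fields(entry: WorkflowEntry) -> List[str]:
    """List the columns that are blank or carry an unrecognised pattern."""
    problems = [f.name for f in fields(entry)
                if not getattr(entry, f.name).strip()]
    if entry.pattern.strip() and entry.pattern not in PATTERNS:
        problems.append("pattern (not one of the three)")
    return problems


def register_lines(entries: List[WorkflowEntry]) -> Dict[str, List[WorkflowEntry]]:
    """Group workflows by pattern: one risk-register entry per key."""
    grouped: Dict[str, List[WorkflowEntry]] = defaultdict(list)
    for entry in entries:
        grouped[entry.pattern].append(entry)
    return dict(grouped)


# Entirely hypothetical example rows.
EXAMPLE = [
    WorkflowEntry("support-ticket triage", "managed", "hosted frontier model",
                  "United States", "vendor-supplied", "Head of Service"),
    WorkflowEntry("case-note summarisation", "self-hosted", "open-weights model",
                  "United Kingdom", "in-house", "Data Protection Lead"),
    WorkflowEntry("grant-report drafting", "hybrid", "hosted frontier model",
                  "", "in-house router", "Head of Fundraising"),
]

if __name__ == "__main__":
    for pattern, rows in register_lines(EXAMPLE).items():
        print(pattern, [row.workflow for row in rows])
    for row in EXAMPLE:
        gaps = incomplete_fields(row)
        if gaps:
            print(row.workflow, "is missing:", gaps)
```

In this example the third row would be flagged for a missing hosting jurisdiction, which is the kind of gap that is cheaper to find in May than in late July.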
Put one question to the technology lead at the next leadership meeting. For each AI tool the organisation is paying for today, where does the data physically go when it is processed, and is a self-hosted version of the tool available if the data-residency picture changes? The answer does not need to lead to a change of supplier. It needs to be known, written down, and dated. If the supplier offers no self-hosted option and has no plans for one, that is information worth carrying into the next renewal. If a self-hosted option exists, whether to use it is a separate decision, taken on its own merits.