The claim worth interrogating

Somewhere between the hype cycle for large language models and the very real anxiety about datacenter power consumption, a quieter argument has been gaining traction: what if you ran AI inference *inside the database*, rather than shipping data out to a cloud API?

The Register flagged this framing in late May 2026, anchoring it to PostgreSQL — the open-source relational database that underpins a significant share of enterprise data infrastructure — and to two distinct pressures: energy costs and data sovereignty.

Both pressures are real. The inference-in-Postgres response to them is architecturally coherent. Whether it actually delivers at scale is a different question, and the honest answer right now is: we don't fully know yet.

What 'AI in Postgres' actually means

To be precise about the architecture: this is not about storing AI model weights in a database table. The proposal is to run inference — the process of passing input data through a trained model to get a prediction or generated output — as a native operation within the database engine, using extensions or embedded runtimes.

Postgres has a mature extension system. Projects like `pgvector` (for storing and querying vector embeddings, the numerical representations that underpin semantic search and retrieval-augmented generation) have already demonstrated that ML-adjacent workloads can live comfortably inside Postgres. The next step, running a small model's forward pass directly in the database process, is technically feasible by embedding an inference runtime such as ONNX Runtime or llama.cpp in an extension.
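
To make the `pgvector` part concrete, here is a minimal Python sketch of what a cosine-distance nearest-neighbour lookup computes. It is an illustration, not pgvector's implementation: pgvector documents `<=>` as its cosine-distance operator, but the function names, the in-memory list standing in for a table, and the two-dimensional vectors are all assumptions made for this example.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, rows, k=1):
    """Return the k rows whose embedding is closest to the query.

    `rows` is a list of (id, embedding) pairs, a hypothetical stand-in for a
    table. The rough SQL analogue with pgvector would be:
        SELECT id FROM items ORDER BY embedding <=> '[1, 0.1]' LIMIT 2;
    where `items` and `embedding` are invented names.
    """
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]


# Toy data: three "documents" with made-up two-dimensional embeddings.
documents = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
closest = nearest([1.0, 0.1], documents, k=2)
print([doc_id for doc_id, _ in closest])
```

A real index avoids the full sort by using approximate search structures; the brute-force scan here only shows the distance being ranked.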

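The practical implication: a query that previously required an application to call an external API, wait for a response, and pipe the result back into a database transaction could instead execute entirely within the database server. That removes a network round trip per inference call and keeps the input data on the machine that already stores it.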
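
The shape of that single-server execution can be sketched with Python's standard-library `sqlite3` module, which lets a Python function be registered and then called from SQL. This is a stand-in only: SQLite is not Postgres, `toy_sentiment` is a hypothetical keyword rule rather than a trained model, and the `tickets` table is invented for illustration. A real Postgres deployment would expose the model through an extension function instead.

```python
import sqlite3


def toy_sentiment(text):
    """Hypothetical stand-in for a small model's forward pass."""
    negative_words = {"broken", "refund", "late"}
    words = set(text.lower().split())
    return "negative" if words & negative_words else "neutral"


def classify_in_database(rows):
    """Load rows into an in-memory table and classify them inside one query."""
    conn = sqlite3.connect(":memory:")
    # Register the Python function so SQL can call it by name with one argument.
    conn.create_function("toy_sentiment", 1, toy_sentiment)
    conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO tickets (id, body) VALUES (?, ?)", rows)
    # The "inference" runs where the data lives: no external API call,
    # and no copy of the ticket text leaves the database process.
    result = conn.execute(
        "SELECT id, toy_sentiment(body) FROM tickets ORDER BY id"
    ).fetchall()
    conn.close()
    return result


print(classify_in_database([(1, "my order arrived late"), (2, "thanks for the help")]))
```

The point of the sketch is the data path, not the classifier: the application issues one query and gets labelled rows back.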

The energy argument

The 'billion AI agents' framing in the source headline is doing rhetorical work, but it points at something real. Aggregate inference demand — across enterprise applications, consumer products, and the autonomous software agents now being deployed at scale — is a material contributor to datacenter power load.

Centralised inference clusters are energy-intensive in ways that go beyond the GPU compute itself: cooling, power conversion losses, and the network infrastructure required to serve millions of concurrent requests all add overhead. Distributed inference, run closer to where data already lives, theoretically reduces some of that overhead, most directly the network transfer, since cooling and conversion losses follow the hardware wherever it runs.

The word 'theoretically' is doing load-bearing work in that sentence. Published, independently verified energy comparisons between centralised cloud inference and Postgres-native inference for equivalent workloads are not, to my knowledge, in the public record as of this writing. The efficiency argument is plausible from first principles. It has not been rigorously quantified in this specific context.
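
What a rigorous comparison would need to measure can at least be written down. The sketch below uses power usage effectiveness (PUE, total facility energy divided by IT equipment energy) to fold in cooling and power-conversion overhead, and adds a per-request network term. The class, its field names, and every number in the usage lines are placeholders invented to show the arithmetic; none of them are measurements.

```python
from dataclasses import dataclass


@dataclass
class InferenceDeployment:
    """Hypothetical per-request energy model for one deployment style."""
    compute_joules_per_request: float  # energy drawn by the CPU/GPU itself
    pue: float                         # facility overhead multiplier (>= 1.0)
    network_joules_per_request: float  # energy to move the request and response

    def joules_per_request(self):
        # Facility overhead scales the compute term; network is added on top.
        return self.compute_joules_per_request * self.pue + self.network_joules_per_request


def relative_saving(baseline, candidate):
    """Fraction of baseline energy saved by the candidate (negative if worse)."""
    base = baseline.joules_per_request()
    return (base - candidate.joules_per_request()) / base


# Placeholder inputs only, chosen for round numbers, NOT measured values.
centralised = InferenceDeployment(10.0, 1.5, 5.0)
in_database = InferenceDeployment(10.0, 1.5, 0.0)
print(relative_saving(centralised, in_database))  # 0.25 with these placeholders
```

Even a crude model like this exposes the terms that could go the other way: if the on-premises server room has a higher PUE than the cloud facility, or a CPU spends more joules per request than a batched GPU, the saving could shrink or turn negative.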

The sovereignty argument

Data sovereignty — the legal principle that data is governed by the laws of the jurisdiction where it is stored or processed — is a more concrete and immediately actionable driver.

For organisations subject to GDPR in the EU, sector-specific regulations in financial services or healthcare, or national data-localisation requirements in markets like India or Brazil, sending data to a US-based cloud AI API is not always legally straightforward. Contractual safeguards exist, but they add compliance overhead and residual risk.

Running inference on-premises, or in a jurisdictionally appropriate cloud region, directly addresses this. If the model runs inside a Postgres instance that never leaves a compliant environment, the data-egress problem largely disappears. This is a genuine enterprise use case, not a theoretical one — procurement teams are already asking these questions.

What Postgres can and can't do here

Postgres is a credible platform for this kind of work. Its extension architecture is flexible, its community is large, and its deployment footprint is enormous. The `pgvector` extension's rapid adoption suggests the ecosystem can absorb ML-adjacent tooling quickly when the use case is clear.

The constraints are also real. Postgres is optimised for transactional and analytical query workloads, not for the memory bandwidth and parallelism that large model inference demands. Embedding a capable language model inside a database process raises questions about resource contention, operational complexity, and the size of models that are actually practical in this context.

Small, task-specific models — classifiers, embedding generators, lightweight generative models — are much more plausible candidates for in-database inference than frontier-scale models. The use cases that fit this architecture well are probably narrower than the broadest version of the pitch implies.
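
The size constraint is mostly arithmetic. As a rough sketch, weight memory is parameter count times bytes per parameter; the figure excludes activations, any key-value cache, and Postgres's own memory, so it is a lower bound. The parameter counts and the 4 GiB budget below are illustrative assumptions, not figures from any particular model or deployment.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}
GIB = 2**30


def weight_footprint_gib(num_params, precision):
    """Memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / GIB


def fits_budget(num_params, precision, budget_gib):
    """True if the weights alone fit within the given memory budget."""
    return weight_footprint_gib(num_params, precision) <= budget_gib


# Illustrative sizes and a hypothetical 4 GiB allowance on the database host.
candidates = [
    ("100M-parameter classifier", 100_000_000),
    ("7B-parameter language model", 7_000_000_000),
]
for label, params in candidates:
    size = weight_footprint_gib(params, "fp16")
    print(f"{label}: {size:.2f} GiB at fp16, fits in 4 GiB: {fits_budget(params, 'fp16', 4.0)}")
```

At 16-bit precision the first case needs roughly 0.19 GiB and the second about 13 GiB, before any working memory. On a server whose RAM is already committed to buffers and query execution, that gap is the practical difference between a feature and a capacity-planning problem.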

The gap between the pitch and the evidence

The framing — Postgres as an answer to the datacenter energy crisis — is ambitious. The underlying ideas are worth taking seriously. But 'an answer' to a crisis implies demonstrated, measurable impact, and that bar has not been cleared in the public literature for this specific approach.

What is supportable: in-database inference is a legitimate architectural pattern that addresses real data-sovereignty requirements and could, in principle, reduce some categories of inference-related energy overhead. What is not yet supportable: that it constitutes a meaningful solution to grid-scale AI power demand.

The distinction matters. Organisations evaluating this architecture should do so on the sovereignty merits, which are solid, and on the latency saved by skipping an external API round trip, rather than on energy-savings claims that remain unquantified.