6 June 2026 · Updated 7 June 2026 · open-weight models · AI strategy · AI since ChatGPT

Open-Weight vs Closed AI Models: Plain-English Guide

Open-weight vs closed AI models, compared in plain English: capability, cost, data residency and who each one suits, so you can pick the right fit.

By The Manchester PC team · Engineers & consultants, Manchester

[Image: abstract comparison of a sealed, locked block against an open lattice of connected blocks, with one highlighted in amber]

For most small and mid-sized businesses a closed model over an API is the right starting point, but if your data must stay in-house or your volume is high and steady, an open-weight model can be the better fit.

If you run a business in Greater Manchester and you are weighing up which kind of AI to build on, this post is for you. By the end you will be able to tell open-weight and closed models apart, see them side by side, and decide which suits your constraints. It is part of the story of AI since ChatGPT.

We are a Manchester technology business that builds AI systems for clients, and we give independent, no-commission advice on this exact choice. The chatbot on this site runs on a closed model over an API, because that is the right fit for our own use.

Quick verdict: who should pick open, who should pick closed?

Pick a closed model if you want the best capability with no infrastructure to manage and pay-as-you-go costs; pick an open-weight model if data residency, high steady volume, deep customisation or offline operation are non-negotiable.

In plainer terms: most businesses, most of the time, should start with a closed model over an API. It is the lower-effort, faster route to a working system. Reach for open weights when a specific constraint, usually where your data is allowed to go, forces your hand.

The honest answer for most readers is that this is not a tribal choice. It is a fit-to-constraint choice, and the right pick depends on your situation. A great deal of online debate treats “open” and “closed” as causes to defend, but a business only cares whether the system does the job, keeps data where it is allowed, and costs what it should.

How do open-weight and closed models compare side by side?

The table below sets out the trade-off across the factors that actually move a decision.

Factor	Closed model (GPT, Claude, Gemini)	Open-weight model (Llama, Mistral, Gemma, DeepSeek)
Capability	Tends to lead at the frontier; newest features first	Strong and improving; some reasoning models now frontier-grade
Cost shape	Pay per use; suits modest or spiky volume	Fixed server cost; suits high, steady volume
Data residency	Data sent to the provider’s API	Data can stay on your own machines or in a UK data centre
Maintenance	Provider handles updates, security and uptime	You own updates, security and uptime
Customisation	Limited; configure rather than retrain	Deep; fine-tune on your own data
Who it suits	Most SMEs wanting capability with low effort	Regulated, high-volume or offline use cases

What is the difference, in the terminology that matters?

A closed model lives on its maker’s servers; an open-weight model can be downloaded and run wherever you choose.

A closed model (GPT-5, Claude, Gemini) lives on its maker’s infrastructure. You send your data to their API, pay per use, and get the best available capability with nothing to manage. You can see the major providers at openai.com and anthropic.com.

An open-weight model (Meta’s Llama, Mistral’s models, Google’s Gemma, DeepSeek’s R1) can be downloaded and run on your own server, in a UK data centre, or even on a strong desktop for smaller models.

Note the precise term: open weight, not open source. The downloadable file is the trained model itself. The training data and full recipe usually stay private, and licences vary, so the wording matters when you make deployment decisions.

How did open-weight models get good enough to matter?

Open-weight models went from “noticeably worse” in mid-2023 to genuine contenders by 2025, driven by three releases.

Llama made open credible. Meta’s Llama 2, in July 2023, was the first heavyweight open-weight family cleared for commercial use, and later versions closed most of the quality gap. By 2025 the family had passed a billion downloads.

Mistral made open efficient. The French startup’s 7-billion-parameter model, in September 2023, outperformed models twice its size, and its Mixtral release in December 2023 brought sparse mixture-of-experts architecture (big-model quality at small-model running cost) into the open world.

DeepSeek made open frightening. In January 2025, DeepSeek-R1 delivered frontier-grade reasoning as a fully open-weight release, at a fraction of the expected training cost. It was the moment the industry stopped treating open weights as the second division. (For why running these models got so cheap, see why AI got cheap.)

By 2026, European players had taken it further, selling sovereign, self-hostable systems an organisation can run entirely inside its own walls.

What is the real business trade-off?

The real trade-off is capability and convenience versus control and data residency.

Closed models win on raw capability at the frontier, zero infrastructure burden, fastest access to new features, and pay-as-you-go economics that suit spiky or modest usage. For most SMEs, most of the time, a closed model over an API is the right answer. It is what we use for our own client chatbots.

Open-weight models win on data residency (nothing leaves your infrastructure, which is decisive in healthcare, legal and finance), predictable costs at high volume, deep customisation through fine-tuning, no vendor dependency, and offline or air-gapped operation.

Data residency is where this gets serious. If you handle personal data, where it goes is a legal question, and the UK Information Commissioner’s Office sets out the rules at ico.org.uk. Since the EU AI Act’s obligations on general-purpose AI models began applying in August 2025, those questions have only sharpened. Our IT consulting service helps clients map this out without selling them a particular model.

Cost deserves a closer look, because the headline numbers mislead. A closed API charges per request, so a quiet month costs almost nothing and a busy month costs more. A self-hosted open model flips that: the server bill is fixed whether it runs flat out or sits idle. At low or unpredictable volume the API usually wins. At high, steady volume the dedicated server usually does, but only call it a win after you have added in the cost of the people who keep it running.
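To make that comparison concrete, here is a minimal sketch of the break-even sum in Python. Every figure in it is a made-up placeholder, not a real price, and the names (CostAssumptions, break_even_requests) are ours for illustration. It also assumes one server can absorb the whole workload, which will not hold at very high volume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostAssumptions:
    """Illustrative monthly figures in pounds. All values are placeholders."""
    api_cost_per_request: float   # closed API: pay per use
    server_cost_per_month: float  # self-hosted: fixed, busy or idle
    staff_cost_per_month: float   # share of the people who keep it running


def api_monthly_cost(requests: int, costs: CostAssumptions) -> float:
    """Closed API cost scales with the number of requests."""
    return requests * costs.api_cost_per_request


def self_hosted_monthly_cost(costs: CostAssumptions) -> float:
    """Self-hosted cost is flat: server plus people, whatever the volume."""
    return costs.server_cost_per_month + costs.staff_cost_per_month


def break_even_requests(costs: CostAssumptions) -> float:
    """Monthly request count at which the two options cost the same."""
    if costs.api_cost_per_request <= 0:
        raise ValueError("api_cost_per_request must be positive")
    return self_hosted_monthly_cost(costs) / costs.api_cost_per_request


if __name__ == "__main__":
    # Hypothetical inputs: 1p per request, £1,500 server, £1,000 of staff time.
    example = CostAssumptions(0.01, 1500.0, 1000.0)
    threshold = break_even_requests(example)
    print(f"Break-even at roughly {threshold:,.0f} requests per month")
    for volume in (20_000, 500_000):
        api = api_monthly_cost(volume, example)
        hosted = self_hosted_monthly_cost(example)
        cheaper = "API" if api < hosted else "self-hosted"
        print(f"{volume:>7,} requests: API £{api:,.0f} vs self-hosted £{hosted:,.0f} -> {cheaper}")
```

Below the break-even volume the API is cheaper; above it, the fixed server is. Leaving the staff line at zero is the usual way this sum goes wrong.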

That maintenance cost is the part businesses underestimate. A closed API needs nobody on your side; the provider patches, scales and secures it. A self-hosted model needs someone responsible for updates, security and uptime. It is also a snapshot in time: closed frontier models improve almost monthly, while a model you run yourself stays as it was until you decide, deliberately, to upgrade.

When should you pick neither, and run a hybrid instead?

In practice the winning answer is often a hybrid: use both, matching each model to the task.

Here is what that looks like. Sensitive document processing runs on an open model in-house. Customer-facing chat runs on a frontier closed model. Routine high-volume classification runs on a small, cheap model. Each job goes to the model that fits its constraints, rather than forcing one model to do everything.
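The routing rule described above can be sketched in a few lines of Python. This is an illustration under our own assumptions, not a production design: the task labels, the Route names and the choose_route function are all hypothetical, and a real system would need a dependable way to decide whether a task contains sensitive data.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    """The three destinations from the example above (names are ours)."""
    IN_HOUSE_OPEN = "open-weight model, self-hosted"
    FRONTIER_CLOSED = "frontier closed model over an API"
    SMALL_CHEAP = "small, cheap model"


@dataclass(frozen=True)
class Task:
    kind: str                      # hypothetical labels: "document", "chat", "classification"
    contains_sensitive_data: bool  # assumed to be decided upstream


def choose_route(task: Task) -> Route:
    """Pick a model tier for a task, applying the data rule before anything else."""
    # Rule 1: data that must stay in-house never goes to an outside API.
    if task.contains_sensitive_data:
        return Route.IN_HOUSE_OPEN
    # Rule 2: routine high-volume classification goes to the cheapest model.
    if task.kind == "classification":
        return Route.SMALL_CHEAP
    # Rule 3: everything else gets frontier capability.
    return Route.FRONTIER_CLOSED


if __name__ == "__main__":
    jobs = [
        Task("document", contains_sensitive_data=True),
        Task("chat", contains_sensitive_data=False),
        Task("classification", contains_sensitive_data=False),
    ]
    for job in jobs:
        print(f"{job.kind}: {choose_route(job).value}")
```

Checking the sensitivity rule first is the design choice that matters: a sensitive classification job still stays in-house, even though a cheaper route exists.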

This handles the where-your-data-goes question cleanly, because the data that must stay in-house never leaves, while the work that benefits from frontier capability still gets it. It also softens the vendor-dependency worry, since no single provider holds your whole operation.

The catch is complexity. A hybrid means more moving parts to run, more places something can break, and a clear rule for which data is allowed to go where. That is worth it for a business with genuinely mixed needs, and overkill for one with a single, simple use case. For more on the rules shaping these choices, see our plain-English AI regulation timeline.

There is rarely a universally right answer. But there is usually a clearly right answer for a specific business with specific constraints. Working that out is exactly what our IT consulting service exists for. The consultation is free, and we will tell you straight if the boring option is the right one.

Frequently asked questions

What is the difference between open-weight and closed AI models?

A closed model runs on its maker’s servers and you reach it through a paid API. An open-weight model can be downloaded and run on your own machines. The key practical difference is where your data goes and who is responsible for running the model.

Is open-weight the same as open-source?

No. Open weight means the trained model file is downloadable, but the training data and full recipe usually stay private, and licences vary. Some are genuinely permissive; others limit commercial use. The wording matters when you are making deployment and compliance decisions.

Are open-weight AI models as good as closed ones?

The gap has narrowed sharply. Llama, Mistral and DeepSeek-R1 brought strong, and in DeepSeek-R1’s case frontier-grade, reasoning into open weights. Closed models still tend to lead at the frontier and ship new features fastest, but for many business tasks an open model is now good enough.

Which is cheaper, an open-weight model or a closed API?

It depends on volume. Closed APIs charge per use, which suits modest or spiky workloads. A self-hosted open model costs roughly the same whether busy or idle, so it tends to win at high, steady volume, provided you have counted both the server and the people running it.

When should a business choose an open-weight model?

Choose open weight when data must stay in-house or in-country, when volume is high and steady, when you need deep customisation through fine-tuning, or when you need offline operation. These constraints are common in healthcare, legal and finance work.

Can I use both open and closed models together?

Yes, and many businesses do. A common hybrid runs sensitive processing on an in-house open model, customer-facing chat on a frontier closed model, and high-volume routine classification on a small cheap model. You match each task to the model that fits its constraints.