How much energy does AI really use in training and daily use?

I’m trying to understand how much electricity modern AI models actually consume, both during large-scale training and when people use them every day. I see headlines claiming AI is an energy hog, but I can’t tell what’s hype versus real data. Are there any concrete examples, benchmarks, or tools I can use to estimate the energy use and carbon footprint of different AI workloads for a small company deciding whether to adopt more AI services?

Short version: training uses a ton of energy for big models; daily use is smaller per request but adds up at scale.

Some rough, real numbers to ground this.

  1. Training big models

Take a frontier LLM, like GPT‑4 class.

Public estimates from papers and industry talks put big runs around:
• 1 to 5 gigawatt‑hours (GWh) of electricity for one full training run
• That equals the yearly electricity use of a few hundred US homes
• CO₂ emissions, if powered by average US grid, can land in the low thousands of tons

Example back‑of‑the‑envelope:
• 10,000 GPUs
• 300 watts per GPU (chip only; the rest of the system adds more, but keep it simple)
• 24 hours per day
• 30 days

Energy ≈ 10,000 × 0.3 kW × 24 × 30 ≈ 2.16 GWh

Real runs often last longer and use more GPUs, and you need to multiply by a cooling‑and‑overhead factor (PUE), usually 1.2 to 1.6.
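
That arithmetic, with the PUE factor included, is easy to script. Every input below is an illustrative assumption from the example above, not a measured value:

```python
# Back-of-the-envelope training energy estimate.
# All inputs are illustrative assumptions, not measurements.
num_gpus = 10_000
gpu_power_kw = 0.3       # 300 W per GPU, chip only
hours = 24 * 30          # a 30-day run
pue = 1.3                # cooling/overhead multiplier, typically 1.2-1.6

it_energy_kwh = num_gpus * gpu_power_kw * hours
facility_energy_kwh = it_energy_kwh * pue

print(f"IT energy:     {it_energy_kwh / 1e6:.2f} GWh")
print(f"With PUE {pue}: {facility_energy_kwh / 1e6:.2f} GWh")
```

With these inputs the chips alone draw about 2.16 GWh, and the facility about 2.8 GWh, which is why quoting only the GPU number understates the real grid draw.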

So, yes, big training runs burn through serious electricity. This is where the scary headlines come from. But there are not thousands of these mega‑runs per week. They are rare and expensive.

  2. Inference, daily use

This is what you use when you chat with an AI or call an API.

Per query, the energy is much smaller, but volume is huge.

Some approximate numbers people in the field quote:
• A single LLM request might consume from a few watt‑seconds up to a few watt‑minutes per 1,000 tokens, depending on model size and hardware
• Converting that, you land around 0.01 to 0.1 Wh per medium‑length reply on efficient hardware
• That is in the same ballpark as a few seconds of using a modern GPU for gaming, or a few minutes of a high‑end smartphone screen on max brightness

If a service handles millions of queries per day, you multiply that, and it becomes a real data center load.
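
To see how per‑query energy turns into a data center load, just multiply it out. The per‑query figure and traffic volume here are assumptions for illustration:

```python
# Scale per-query energy up to service level; inputs are assumptions.
wh_per_query = 0.05           # mid-range of the 0.01-0.1 Wh estimate
queries_per_day = 10_000_000  # a mid-sized service

daily_kwh = wh_per_query * queries_per_day / 1000
yearly_mwh = daily_kwh * 365 / 1000

print(f"{daily_kwh:,.0f} kWh/day, ~{yearly_mwh:,.0f} MWh/year")
```

Ten million queries a day at this rate is about 500 kWh/day, still modest; the numbers only get scary once traffic reaches billions of calls per day.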

  3. How it compares to other stuff

Some simple comparisons to help your intuition.

• One big LLM training run, a few GWh
Roughly similar to:
– on the order of ten billion Google searches (at roughly 0.3 Wh each)
– a few million km of driving a gasoline car
– the per‑passenger emissions of a few thousand transatlantic flights

• One AI chat response, around 0.01 to 0.1 Wh
Roughly similar to:
– loading a media‑heavy web page on a laptop a few times
– sending or reading a small batch of emails
– streaming a few seconds of HD video

So, the main story:
• Training is heavy, but rare
• Usage is light per request, but frequent and scaling fast

  4. What makes the numbers go up or down

You asked what is hype vs real concern, so here is where your attention matters.

Hardware type:
• Older GPUs waste more power per operation
• Newer accelerators (like H100, TPU v5) run more efficiently, so the same workload uses less electricity

Model size:
• 7B parameter models use a lot less energy than 70B or 500B
• Distilled or quantized models reduce power further

Batch size and throughput:
• Data centers run GPUs near full load to improve efficiency
• Half‑idle GPUs waste energy, because they still draw non‑trivial power

Location and grid mix:
• Two data centers using the same electricity amount might have different CO₂, depending on how “clean” the local grid is
• If the grid is heavy on coal, emissions per kWh go up
• If the grid has more wind, solar, or nuclear, emissions go down

Cooling and infrastructure:
• Power Usage Effectiveness (PUE) shows how much extra energy goes into cooling and overhead
• Good data centers have PUE around 1.1 to 1.3
• Bad ones can hit 1.8 or worse
• That means for 1 kWh on chips, the facility might draw anywhere from 1.1 to 1.8 kWh from the grid

  5. Headlines vs reality

Some common claims and how to read them.

“AI will use as much electricity as a small country”
• This kind of quote often extrapolates aggressive growth while assuming little or no gain in hardware efficiency
• Global data centers today use around 1 to 2 percent of world electricity
• AI is part of that, not the whole thing, but the share is growing fast

“Training one model equals X flights to London”
• These comparisons use a rough carbon factor per kWh
• They are not wrong for order of magnitude, but they ignore improvements in training reuse, fine‑tuning, and hardware efficiency

“Using ChatGPT is worse than flying”
• Per query, no
• Per global usage trend, the concern is more about aggregate impact if billions of people use it all day, plus growing models and services built on top

  6. What you can do as a user or developer

If you are a user:
• You do not need to feel bad about a few AI queries per day; the impact is similar to other online activity
• The biggest lever is still your transport, heating, and diet, not your AI chats

If you are a developer or org:
• Pick smaller models when they work
• Run models on efficient hardware
• Prefer regions with cleaner electricity mix if possible
• Measure energy per 1,000 tokens or per request, not only latency and cost
• Cache results and avoid wasteful repeated calls
• For training, share checkpoints, reuse pretraining, focus on fine‑tunes instead of full retrains when possible

  7. If you want to estimate yourself

Rough formula for inference:

Energy per query (Wh)
≈ GPU power (W) × time (s) ÷ 3600 × GPU share

Example:
• 300 W GPU
• Your query keeps it busy for 0.5 seconds full load
• Your model uses half the GPU

Energy ≈ 300 × 0.5 ÷ 3600 × 0.5 ≈ 0.021 Wh

You can then multiply by number of queries per day and by a CO₂ factor for your grid (for US average, people often use around 0.4 to 0.5 kg CO₂ per kWh, though this varies by region).
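
The formula and the worked example above translate directly into a small Python helper. The 0.45 kg CO₂ per kWh default is a rough US‑average assumption, and will vary by region:

```python
def energy_per_query_wh(gpu_power_w: float, busy_seconds: float,
                        gpu_share: float) -> float:
    """Rough per-query inference energy, following the formula above."""
    return gpu_power_w * busy_seconds / 3600 * gpu_share

def co2_grams(wh: float, kg_co2_per_kwh: float = 0.45) -> float:
    """Wh -> grams of CO2. 0.45 kg/kWh is a rough US-average assumption."""
    return wh * kg_co2_per_kwh  # (Wh/1000 kWh) * (kg/kWh) * (1000 g/kg)

# The worked example: 300 W GPU, busy 0.5 s, half the GPU.
e = energy_per_query_wh(300, 0.5, 0.5)
print(f"{e:.3f} Wh per query, ~{co2_grams(e):.4f} g CO2")
```

Multiply the result by your daily query count to get a daily footprint, and swap in your own grid's emission factor for anything outside the US average.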

So, are the “AI is an energy hog” headlines totally off?
Not really, but they are often missing context. Frontier training is heavy, yes. Daily usage is modest per request, but the scale of usage and the trend line matter a lot.

If you share what kind of AI use you care about, like personal usage, running a startup model, or policy level stuff, you can get more tailored numbers.

The short answer: yes, AI uses a lot of energy, but the headlines are… selectively dramatic.

@boswandelaar already laid out good ballpark numbers, so I’ll zoom out a bit and hit different angles instead of redoing the same math.

1. Training vs “lifetime” use

Where I slightly disagree: people often overfocus on “one giant training run = X flights” and forget that big models are reused like crazy.

Roughly:

  • A frontier model might cost a few GWh to pretrain.
  • That same model then serves billions of queries.
  • So per user or per query, the training portion amortizes down to tiny fractions of a Wh.

If a model lives long enough and is heavily used, inference energy can actually dominate over time, not training.
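
A quick sketch of that amortization argument, with assumed (not measured) numbers for training cost, lifetime traffic, and per‑call inference energy:

```python
# How pretraining energy amortizes over a model's lifetime of queries.
# All numbers are illustrative assumptions.
training_gwh = 3.0        # one frontier pretraining run
total_queries = 100e9     # e.g. ~300M queries/day over a year

amortized_wh = training_gwh * 1e9 / total_queries  # GWh -> Wh, per query
inference_wh = 0.05                                # assumed per-call energy
lifetime_inference_gwh = inference_wh * total_queries / 1e9

print(f"amortized training: {amortized_wh:.3f} Wh/query")
print(f"lifetime inference: {lifetime_inference_gwh:.1f} GWh "
      f"vs training: {training_gwh:.1f} GWh")
```

With these assumptions the training share shrinks to about 0.03 Wh per query, while total inference energy (5 GWh here) overtakes the 3 GWh training run, which is the "inference dominates over time" point in numbers.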

2. Inference is small per call, but the shape matters

Per query, numbers like 0.01 to 0.1 Wh are reasonable. Where it gets tricky:

  • Long conversations with lots of tokens cost more than “one-shot” small prompts.
  • Big multimodal models (images, long context windows) are noticeably heavier than tiny text-only models.
  • “Free” features built into products can quietly explode usage, because friction goes to zero and suddenly every click fires an AI call.

So, your occasional chat is minor; a company wiring LLMs into everything a billion users do is where the grid starts to care.

3. Compared to other stuff you do in a day

If you personally:

  • Drive a car: that dwarfs your AI usage.
  • Heat/cool a large home with fossil fuels: same story.
  • Stream hours of video: also significant.

For an individual:

  • A handful of AI queries per day is in the ballpark of “normal internet use” in energy terms, not some wild outlier.

So the “using ChatGPT = flying to London” type lines are just… no.

4. The real risk is scale + expectations, not one model

Where I’m probably more pessimistic than @boswandelaar:

  • Every year: more models, bigger context windows, higher quality, more always-on assistants.
  • Businesses are pushed to add AI to everything, often where it is barely needed.
  • Data center power demand is already forcing grid upgrades in some places.

If AI usage keeps scaling faster than hardware efficiency and grid decarbonization, it can absolutely become a major load, not just “some percent of data centers.”

5. “Which numbers should I actually trust?”

Simple sanity checks:

  • If someone claims “this model uses as much power as a country,” ask:
    • Over what time period?
    • Is that extrapolated growth or current use?
    • Did they account for newer, more efficient hardware?
  • If someone compares to flights:
    • Check if they give a kWh figure and an emissions factor, or just vibes.

Most credible stuff gives:

  • kWh or GWh
  • Hardware type (H100, A100, TPUs, etc.)
  • PUE or some mention of cooling/overhead
  • Grid assumptions (renewables share, region)

If it does not mention any of these, treat it as a rhetorical piece, not a measurement.

6. What this means for you concretely

If you’re just a user:

  • Your AI footprint is roughly in the same class as your other online tech habits.
  • If you care about climate, your top levers are still transport, food, heating, and electricity provider, not “using fewer LLM prompts.”

If you’re building with AI:

  • Biggest win: use smaller models where possible.
  • Don’t spam calls “just because we can.”
  • Monitor energy per request like you monitor latency and cost.
  • Think about caching, batching, and offloading some work to local / on-device models where practical.

So yes: AI has a nontrivial energy cost, especially at scale and especially for massive training runs. But if you see a headline implying that asking a chatbot a question is some kind of eco-crime, that’s more storytelling than physics.

A useful way to think about “how much energy AI really uses” is to compare where the watts go rather than just quoting big training numbers.

1. Where the energy actually flows

Ignoring networking and storage, most of the power in AI workloads hits three buckets:

  1. Training compute
  2. Inference compute
  3. Overhead in the data center (cooling, power conversion, etc.)

@boswandelaar already walked through solid back‑of‑the‑envelope numbers per training run and per query. I mostly agree, but I think they underplay one subtlety: the shape of infrastructure makes some models dirtier per joule than others, even at the same nominal kWh.

Example:
Two identical models could each “cost” 3 GWh to train. One runs in a modern hyperscale data center with a PUE around 1.1 and a relatively clean grid. The other runs in an older colocation facility with a PUE of 1.6 on a coal‑heavy grid. Same FLOPs, wildly different real‑world impact. Headlines rarely distinguish that.

2. Training: why the same 3 GWh can be “cheap” or “expensive”

I only partially buy the “amortize training so it is tiny per user” story:

Pros of the amortization view

  • It is correct physics: once a model is trained, the training energy is a sunk cost. Spread across billions of uses, it really does fade into the noise per query.
  • It helps avoid scare stories like “this model = 10,000 flights” without context.

Cons

  • It can hide incentive problems. A lab can always argue “we will serve a billion users later,” which makes every new 10× bigger run look “fine” in theory.
  • Frontier labs may retire or replace models faster than expected. If a model is superseded after 6–12 months, you never actually get full amortization.
  • There is a portfolio effect. Dozens of massive runs stacking up every year is not the same as “one huge model reused forever.”

So: yes, per‑user impact of training is likely small if the model is truly reused at scale, but system‑wide, a fast frontier race still translates into a lot of new GWh each year.

3. Inference: small per call, big in aggregate

I agree with @boswandelaar that a single query is roughly in the “normal internet use” band. Where I diverge is how quickly that can cease to be comforting:

  • Multiply 0.02 to 0.1 Wh per LLM call by tens of billions of calls per day.
  • Add long context windows, retrieval, image generation, audio, code interpreters.
  • Then bolt AI into office suites, search, social feeds, customer support, etc.

At that point, inference energy is not just “some extra JavaScript” compared to browsing. It becomes a major pillar of data center load, especially if companies treat “AI everywhere” as a default UX.
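
A rough way to sanity‑check that claim is to convert call volume into continuous power draw; every input here is an assumption for illustration:

```python
# Convert daily AI call volume into average grid load; assumed inputs.
wh_per_call = 0.05
calls_per_day = 20e9    # tens of billions of calls per day
pue = 1.2               # data center cooling/overhead multiplier

daily_kwh = wh_per_call * calls_per_day / 1000 * pue
avg_mw = daily_kwh / 24 / 1000  # average continuous power draw

print(f"~{avg_mw:.0f} MW of continuous load")
```

Twenty billion calls a day at these per‑call numbers works out to roughly 50 MW of continuous load, the scale of a sizable data center, before you add heavier multimodal workloads on top.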

4. Why some AI use is worse than others

Instead of asking “Is AI an energy hog?” I’d split usage into three rough categories:

  1. High value, low volume

    • Scientific research, assistive tech, specialized tools.
    • Even if energy per query is high, overall volume is low and social value arguably high.
  2. Medium value, medium volume

    • Coding assistants, productivity tools, language help.
    • This is probably where the best cost/benefit balance lives today.
  3. Low value, high volume

    • Autogenerated spam, pointless “AI summaries,” flavor‑of‑the‑month novelty features.
    • This is where energy and emissions are hardest to justify.

From a climate standpoint, pushing back on category 3 matters more than shaming individuals for chatting occasionally.

5. What to watch in the headlines

Instead of focusing only on training vs inference, I’d watch these trends:

  • Data center siting
    Are new AI clusters going into regions with grid constraints or fossil‑heavy mixes, or into places with strong renewables buildout?

  • PUE and hardware generation
    Newer data centers and accelerators can realistically cut the energy per token in half or better over time. Claims that ignore generation differences will overestimate cost.

  • Water use and local impact
    Cooling AI clusters can stress local water resources. This does not show up in “kWh per query” but absolutely matters.

  • Demand response
    Training can be somewhat flexible in timing. If operators tie training schedules to periods of renewable surplus, the same kWh can come with less marginal emissions.

If an article shouts about AI’s “carbon footprint” but never touches grid mix, PUE, or hardware generation, take it as rhetoric rather than analysis.

6. What this means in practice

For individual users:

  • Your personal AI usage is unlikely to be the main driver of your footprint compared to transport, home energy, and diet.
  • If you care about impact, prioritize lower‑carbon electricity for your home, less car/plane travel, and heat pumps and insulation before worrying about “did I ask three or ten questions today.”

For teams building products:

  • Default to smaller or distilled models wherever they are “good enough.”
  • Avoid background or always‑on AI calls fired on every scroll or keystroke.
  • Batch, cache, and consider local/on‑device models for simple tasks.
  • Track energy and emissions alongside latency and cost. Vendors increasingly expose per‑request energy or carbon estimates.

7. On “how much electricity do modern AI models consume” as a global thing

If AI build‑out keeps going at the current pace, AI could end up:

  • A noticeable fraction of global data center power (which itself is a growing slice of total electricity), and
  • A localized stressor on certain grids and water systems.

Is it the new coal plant on its own? No. Is it a load serious enough to be part of infrastructure and climate planning? Yes.

So I agree with @boswandelaar that sensational one‑off comparisons are misleading, but I’m more cautious about the long‑term path if “AI by default” becomes the design rule for everything online.

In short: one model training run is not the apocalypse; one chat with a model is not a climate sin; but the collective decision to stuff expensive inference into every user action is exactly where the energy story gets real.