AI is no longer a side experiment. It is already part of products, workflows, and day-to-day operations. So the question is not whether companies should use AI, but how to use it responsibly at scale. From our perspective, it starts with efficiency: getting the same or better results with less compute, less energy, and a lower environmental footprint.
Model choice affects energy use and emissions, and as expectations around transparency grow, teams need a more solid way to explain those choices. The good news is that reducing AI’s footprint does not mean slowing innovation. It means running better systems: smaller models, faster inference, and clearer measurement.
Before we go further, let’s start with a few facts about AI and sustainability.
These figures are clear. Behind powerful AI tools, there is an environmental cost. AI systems require large amounts of natural resources and contribute to greenhouse gas emissions. Understanding how AI impacts the environment is important because the choices made now will shape whether AI becomes a tool for sustainability or a source of greater environmental pressure.
When discussing environmental impact, it is easy to overlook which parts of an AI model’s lifecycle affect the environment. The focus is usually placed on the impact of using AI tools, especially the energy required by large data centers. However, environmental effects occur across every stage of the AI lifecycle.
These effects arise not only during the training of AI models, but also during deployment and in the stages required to make AI possible, such as material extraction, equipment manufacturing, cooling, networking, and storage. The environmental costs of AI can take different forms, including the use of natural resources such as energy, water, and minerals, as well as greenhouse gas emissions.
Energy is one of the most visible resources used by AI systems. AI models require significant amounts of electricity not only during training, but also during deployment, when users interact with them in real time. Serving those requests requires physical hardware. The amount of hardware needed depends on the size and optimization of the model: some models can be deployed on a single GPU (Graphics Processing Unit), while others require multiple GPUs and, therefore, more energy. As a result, data centers are energy-intensive facilities. They concentrate powerful computing hardware, which can account for around 40–50% of a data center’s total energy use, as well as networking systems, storage, and cooling infrastructure, which can account for around 30–40%.
The environmental impact of this energy use depends largely on where the electricity comes from. If data centers are powered by fossil fuels, AI can contribute to higher greenhouse gas emissions. If they use renewable energy sources, the impact can be reduced. As AI becomes more widely used, improving energy efficiency, increasing transparency from large technology companies, and expanding the use of clean energy will be important for reducing its environmental footprint.
One aspect mentioned earlier is the energy used by cooling infrastructure in data centers. However, cooling does not only require energy; it can also require large amounts of water. Depending on the data center infrastructure, water use can range from 0.18 to 1.1 liters per kWh of energy. Water is used to remove heat through cooling systems, and it often needs to be clean to prevent damage to cooling pipes and equipment. Moreover, a significant portion of this water can evaporate due to high temperatures, meaning it does not always return to the same cycle. Water is also used in other stages of AI hardware production, such as semiconductor manufacturing for GPU chips, where it is needed for cleaning and sterilization, and, to a lesser extent, for generating energy.
Source: https://arxiv.org/pdf/2304.03271

To manufacture the chips and hardware on which AI models run, large amounts of metals are required, including aluminum, copper, tin, tantalum, lithium, gallium, germanium, palladium, cobalt, and tungsten. Extracting these materials can have significant environmental impacts, as mining often requires high energy use, large amounts of water, and the removal of soil and vegetation. It can also contribute to habitat disruption, pollution, and waste from mining processes. As demand for AI hardware grows, the need for these minerals may increase, making responsible sourcing, recycling, and more efficient hardware design important for reducing AI's environmental footprint.
One of the most common sources of greenhouse gas emissions is electricity generation, which, as mentioned earlier, is required across different stages of AI. Emissions can also be produced during the manufacturing of specific materials, such as concrete and metals used to build data centers and the hardware infrastructure that supports AI systems.
Now that we understand how AI can impact the environment, the next question is how we can measure that impact. However, to do so, we first need to understand that there is no single, fixed footprint for an “AI request.”
Source: https://arxiv.org/pdf/2311.16863

Perfectly measuring every aspect of sustainability is not possible. Still, it is worth tracking what can be measured and making the necessary updates as better data becomes available. So, it is time to look at the numbers.
This formula calculates the energy consumption of a query $i$ at the lower and upper utilization bounds. First, it calculates the total inference time in hours, denoted as $T_i$. Then, the formula multiplies this time by the effective power used by the hardware: $P_{\text{GPU}} \times U_{\text{GPU},\{\min,\max\}}$ represents the GPU power draw under the lower or upper utilization assumption, while $P_{\text{non-GPU}} \times U_{\text{non-GPU}}$ represents the power draw from non-GPU components such as CPU, memory, networking, and storage. Finally, the result is multiplied by $\text{PUE}$, which accounts for additional data center overhead such as cooling and power distribution.
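Written out from the description above, the calculation has the form $T_i \times \left(P_{\text{GPU}} \times U_{\text{GPU},\{\min,\max\}} + P_{\text{non-GPU}} \times U_{\text{non-GPU}}\right) \times \text{PUE}$. The following is a minimal Python sketch of that arithmetic. The function name, the parameter names, and every number in the example are illustrative assumptions, not measurements from any real deployment.

```python
def query_energy_kwh(
    inference_time_s: float,
    p_gpu_kw: float,
    u_gpu: float,
    p_non_gpu_kw: float,
    u_non_gpu: float,
    pue: float,
) -> float:
    """Energy of one query in kWh: T_i * (P_GPU*U_GPU + P_nonGPU*U_nonGPU) * PUE."""
    t_hours = inference_time_s / 3600.0  # T_i, total inference time in hours
    effective_power_kw = p_gpu_kw * u_gpu + p_non_gpu_kw * u_non_gpu
    return t_hours * effective_power_kw * pue


# Hypothetical inputs: a 3.6 s query on a node with 5 kW of GPUs and 1 kW of
# non-GPU components, evaluated at an assumed lower and upper GPU utilization.
gpu_utilization_bounds = {"min": 0.1, "max": 0.2}
bounds = {
    name: query_energy_kwh(
        inference_time_s=3.6,
        p_gpu_kw=5.0,
        u_gpu=u_gpu,
        p_non_gpu_kw=1.0,
        u_non_gpu=0.5,
        pue=1.2,
    )
    for name, u_gpu in gpu_utilization_bounds.items()
}

for name, energy_kwh in bounds.items():
    print(f"{name}: {energy_kwh * 1000:.2f} Wh")
```

Evaluating the function once per utilization bound gives a range rather than a single number, which matches the idea that there is no fixed footprint for a request.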
This formula calculates the water consumption of a query in liters by separating the impact into on-site and off-site components. $\frac{E_{\text{query}}}{\text{PUE}} \cdot \text{WUE}_{\text{site}}$ estimates the water used on-site at the data center, mainly for cooling; $\frac{E_{\text{query}}}{\text{PUE}}$ isolates the IT energy consumed by the computing equipment, and this value is multiplied by $\text{WUE}_{\text{site}}$, the data center’s on-site water usage effectiveness in liters per kilowatt-hour. The quantity $E_{\text{query}} \cdot \text{WUE}_{\text{source}}$ estimates the off-site water consumption associated with generating the electricity used by the query, where $\text{WUE}_{\text{source}}$ represents the water intensity of the electricity source. Adding both terms gives the total estimated water consumption for the query.
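The two terms described above can be sketched in a few lines of Python. This is an illustration only: the function name and the example values are assumptions. The on-site figure uses the lower end of the 0.18 to 1.1 liters per kWh range mentioned earlier, and the off-site water intensity is a made-up placeholder.

```python
def query_water_liters(
    e_query_kwh: float,
    pue: float,
    wue_site_l_per_kwh: float,
    wue_source_l_per_kwh: float,
) -> tuple[float, float]:
    """Return (on_site, off_site) water consumption of one query in liters."""
    it_energy_kwh = e_query_kwh / pue  # strip data center overhead to get IT energy
    on_site = it_energy_kwh * wue_site_l_per_kwh  # cooling water at the facility
    off_site = e_query_kwh * wue_source_l_per_kwh  # water used to generate the electricity
    return on_site, off_site


# Hypothetical query: 0.0012 kWh in a facility with PUE 1.2.
on_site_l, off_site_l = query_water_liters(
    e_query_kwh=0.0012,
    pue=1.2,
    wue_site_l_per_kwh=0.18,
    wue_source_l_per_kwh=3.0,  # assumed value, depends on the local energy mix
)
total_l = on_site_l + off_site_l
print(f"on-site: {on_site_l * 1000:.3f} mL, off-site: {off_site_l * 1000:.3f} mL")
```

Returning the two components separately keeps it visible how much of the estimate comes from the facility itself and how much from the electricity supply.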
This formula calculates the carbon emissions of a query in kilograms of carbon dioxide equivalent. $E_{\text{query}}$ represents the energy consumed by the query, and $\text{CIF}$ is the carbon intensity factor of the electricity supply, usually expressed in $\text{kgCO}_2\text{e}/\text{kWh}$. Multiplying the two gives an estimate of the greenhouse gas emissions associated with running that query.
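As a rough illustration of how much the electricity source matters, the sketch below applies this multiplication to the same hypothetical query on two grids. The function name and both carbon intensity values are assumptions chosen for the example, not figures for any real region.

```python
def query_carbon_kg(e_query_kwh: float, cif_kg_per_kwh: float) -> float:
    """Carbon emissions of one query in kgCO2e: E_query * CIF."""
    return e_query_kwh * cif_kg_per_kwh


E_QUERY_KWH = 0.0012  # hypothetical energy of one query

# Assumed carbon intensity factors in kgCO2e/kWh.
fossil_heavy_kg = query_carbon_kg(E_QUERY_KWH, cif_kg_per_kwh=0.4)
low_carbon_kg = query_carbon_kg(E_QUERY_KWH, cif_kg_per_kwh=0.05)

print(f"fossil-heavy grid: {fossil_heavy_kg * 1000:.3f} gCO2e")
print(f"low-carbon grid:   {low_carbon_kg * 1000:.3f} gCO2e")
```

With identical energy use, the estimate scales linearly with the carbon intensity factor, so the same query can differ several-fold in emissions depending on where it runs.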
Tools for individual testing:
The formulas above help quantify part of AI’s environmental impact, but they also raise a broader question: how does energy use affect the performance of AI systems themselves?
On the one hand, using more energy can improve the quality of AI outputs. This relationship has been widely studied through scaling laws, which show that increasing compute during training and, in some cases, during inference can lead to better model quality. Larger models, longer training runs, and more complex inference strategies can all improve the accuracy, reliability, or usefulness of predictions.
However, more energy does not automatically mean better performance. A system that produces high-quality results but requires more time, larger hardware, higher compute costs, and greater energy consumption may not be efficient overall. Higher energy use can also increase the environmental impact of AI by requiring more resources to build, run, and cool the servers that support these systems.
> Performance is the combination of quality and efficiency.
At a practical level, efficient AI means achieving the same results with fewer resources. Even when sustainability is not your main priority, optimizing energy use remains important because it directly affects overall system performance, including cost, speed, scalability, and hardware requirements.
Because efficiency reduces environmental impact without requiring users or developers to do less, it shows that sustainable AI is not only about setting limits, but also about designing systems that are faster, more scalable, and less resource-intensive. In this sense, improving efficiency can benefit both the environment and the performance of AI systems, making it a practical and necessary direction for the future.
This is one of our key motivations behind sustainable AI: aligning environmental goals with broader performance incentives so that better engineering choices lead to lower impact.
There is no single, one-size-fits-all approach to reducing the environmental impact of AI models. At Pruna, we believe that sustainable AI starts with efficient AI, and we work across several areas to make this possible.
At Pruna, we offer highly optimized models through our P-models family. They are smaller, faster, and more energy-efficient than many other released models, while still maintaining strong quality. This includes P-Image, P-Image-Edit, and P-Video, among others, which are 3 to 6 times more energy efficient than other models for the same tasks.
In addition, we provide optimized endpoints through our API and through other vendors, making the models more lightweight and easier to integrate into different environments. This reduces hardware requirements and energy consumption without compromising usability. Examples include Wan 2.2 and Flux 2.
Check our P-models [here].
If none of the provided models meet your needs, we also offer tools to make your preferred model smaller and more efficient. The [OSS Pruna package](https://github.com/PrunaAI/pruna) is a model optimization framework that helps developers build faster and more efficient models with minimal overhead. It provides a comprehensive suite of compression techniques (caching, quantization, pruning, distillation, compilation, kernels, or recoverers) that can be easily combined without requiring complex manual integration.
Check the [Pruna GitHub repository](https://github.com/PrunaAI/pruna).

We have also collaborated with different initiatives and communities to promote AI efficiency beyond our own work.
For instance, we have been running AI efficiency meetups and webinars where we discuss this topic with pruners, as well as with invited speakers from the broader AI and sustainability community. We have also collaborated with other organizations: we hosted a community event with CodeCarbon and EcoLogits, where participants could learn, exchange ideas, and discuss practical ways to measure and reduce the environmental impact of AI. We also supported the 1st International Challenge on Compression of AI Models, aiming to contribute to sustainable AI by encouraging participants to optimize models.
To measure the environmental impact, we integrated our runs with CodeCarbon and used their dashboard to track the results. We also estimated the energy use and CO₂ emissions avoided by comparing our optimized models with their base versions: what would have been consumed without optimization versus what was actually required when using Pruna.
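The comparison described above comes down to a simple difference: energy the base model would have used minus energy the optimized model actually used, scaled by the number of runs. The sketch below shows that arithmetic under stated assumptions. It is not the CodeCarbon API and not necessarily the exact method used for the reported results. The names `AvoidedImpact` and `avoided_impact` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvoidedImpact:
    energy_kwh: float
    co2e_kg: float


def avoided_impact(
    base_kwh_per_run: float,
    optimized_kwh_per_run: float,
    runs: int,
    cif_kg_per_kwh: float,
) -> AvoidedImpact:
    """Estimate energy and emissions avoided by serving the optimized model."""
    saved_kwh = (base_kwh_per_run - optimized_kwh_per_run) * runs
    return AvoidedImpact(energy_kwh=saved_kwh, co2e_kg=saved_kwh * cif_kg_per_kwh)


# Hypothetical scenario: the optimized model needs a quarter of the energy
# per run, over one million runs, on a grid at an assumed 0.4 kgCO2e/kWh.
result = avoided_impact(
    base_kwh_per_run=0.004,
    optimized_kwh_per_run=0.001,
    runs=1_000_000,
    cif_kg_per_kwh=0.4,
)
print(f"avoided energy: {result.energy_kwh:.0f} kWh")
print(f"avoided emissions: {result.co2e_kg:.0f} kgCO2e")
```

In practice the per-run energy values would come from measured runs of both model versions on the same hardware, so the counterfactual stays comparable.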
These are the results we achieved over the past year for a single provider.
A quick disclaimer: making AI more efficient is only one part of sustainable AI. Efficiency improvements can sometimes lead to more overall usage, known as the rebound effect. We should also ask whether AI is needed for every task, because in many cases, simpler solutions may be enough.
In this blog post, we analyzed how AI impacts the environment, the stages where this impact occurs, and the main costs associated with it. We then explored how to measure this impact, showing that although results can vary depending on the prompt, task, deployment setup, and other factors, existing formulas can still help provide useful estimates. Finally, we presented what we are doing at Pruna through our efficient models, open-source package, and community events, and shared some of the results we have achieved.
Falk, S., Ekchajzer, D., Pirson, T., Lees-Perasso, E., Wattiez, A., Biber-Freudenberger, L., Luccioni, S., & van Wynsberghe, A. (2025). More than Carbon: Cradle-to-Grave environmental impacts of GenAI training on the Nvidia A100 GPU. arXiv. https://doi.org/10.48550/arXiv.2509.00093
Jegham, N., Abdelatti, M., Koh, C. Y., Elmoubarki, L., & Hendawi, A. (2025). How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference. arXiv. https://doi.org/10.48550/arXiv.2505.09598
Luccioni, S., Trevelin, B., & Mitchell, M. (2024). The Environmental Impacts of AI — Policy Primer. Hugging Face Blog. https://doi.org/10.57967/hf/3004
Luccioni, S., Jernite, Y., & Strubell, E. (2023). Power Hungry Processing: Watts Driving the Cost of AI Deployment? arXiv. https://doi.org/10.48550/arXiv.2311.16863