Raising Awareness of AI’s Climate Impact: Evaluating the Effectiveness of Climate Impact Overlays in Generative AI Interfaces Using Eye-Tracking Data
As generative AI tools, from large language models to diffusion-based image generators, become increasingly integrated into everyday work and creative practices, their environmental impact remains largely overlooked. This project investigates how visual eco-feedback can influence user behavior during AI-powered image generation. We designed a climate impact overlay, positioned in the top-right corner of the ChatGPT interface, to surface the hidden energy costs of image generation. To understand its effect, we conducted a user study combining eye-tracking data collected via the Pupil Core system with qualitative insights gathered through post-study interviews. Our goal was to examine whether presenting climate-related information could prompt more environmentally conscious decision-making without disrupting the creative workflow.
Experiment Design
We conducted our study with 7 participants aged between 23 and 30 years, comprising 4 males and 3 females. Each participant engaged in two prompt-based image generation tasks using DALL·E through ChatGPT. The tasks were:
- Generate an image of a princess and a tech CEO in a palace.
- Generate an image of an Egyptian pharaoh and a robot in space.
Participants were informed that their generated images would be evaluated on creativity and polish, and they were encouraged to iterate as many times as desired to produce their most refined and imaginative result.
The experiment employed a within-subjects counterbalanced design to mitigate carryover effects. Participants were divided into two groups:
- Group 1 (P1, P2, P3, and P7): completed Task 1 without the climate overlay and Task 2 with it.
- Group 2 (P4, P5, and P6): completed Task 1 with the overlay and Task 2 without it.
This ensured that each task was experienced both with and without environmental feedback across the participant pool.
Overlay Design
We developed a custom Chrome extension to introduce a climate impact overlay into the ChatGPT interface during image generation. The overlay design was informed by the eco-feedback design-behavior framework of Sanguinetti et al. [5], which organizes feedback along information, timing, and display dimensions (Figure 1).
The overlay was placed in the top-right corner of the interface and became visible each time an image was generated. It presented real-time feedback on the energy cost of that generation in relatable units: (1) number of smartphone charges, (2) number of Google searches, and (3) hours of LED light usage. The overlay flashed briefly upon each update to draw user attention and maintained a cumulative count across iterations. This dynamic, visible feedback aimed to contextualize the environmental implications of each generation event in intuitive terms.
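To make the conversion behind these units concrete, the Python sketch below maps a per-image energy estimate onto the three equivalents. Every constant is an illustrative assumption rather than a value measured in our study or hard-coded in the extension; published per-image estimates such as those in [1] vary widely across models and hardware.

```python
# Sketch of the relatable-units conversion shown in the overlay.
# All constants are illustrative assumptions, not measured values.

ENERGY_PER_IMAGE_WH = 3.0    # assumed energy per image generation (Wh)
SMARTPHONE_CHARGE_WH = 12.0  # assumed energy for one full phone charge (Wh)
GOOGLE_SEARCH_WH = 0.3       # assumed energy per search (Wh)
LED_BULB_W = 10.0            # assumed LED bulb power draw (W)

def relatable_units(num_images: int) -> dict:
    """Translate cumulative generation energy into everyday equivalents."""
    total_wh = num_images * ENERGY_PER_IMAGE_WH
    return {
        "smartphone_charges": total_wh / SMARTPHONE_CHARGE_WH,
        "google_searches": total_wh / GOOGLE_SEARCH_WH,
        "led_hours": total_wh / LED_BULB_W,
    }

# After four generations: 1.0 charges, 40.0 searches, 1.2 LED-hours.
print(relatable_units(4))
```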
Methods
Quantitative Methods
- Direct Metrics: We exported raw gaze coordinates and pupil diameter values from the Pupil Core system using Pupil Player.
- Derived Metrics: From the exported data, we computed higher-level gaze behavior metrics such as FixationCount, FixationsInAOI, GazePointsInAOI, Time to First Gaze (TTFG), TotalTimeInAOI (ms), Entropy, MeanPupilDiameter, PupilDiameterInAOI, MeanVelocity, and MaxVelocity.
Qualitative Methods
- Post-task Surveys: After completing both tasks, participants responded to a structured set of questions aimed at gauging their awareness and reflections on the climate information overlay.
Data Analysis
Eye Tracking Analysis
The Pupil Core system includes two core software tools: Pupil Capture and Pupil Player. We used Pupil Capture to calibrate each participant and record their eye-movement data during the image generation tasks. Upon completion of each session, the recordings were reviewed in Pupil Player, which allowed us to export gaze positions and pupil diameter data as CSV files for further analysis.
Each of the seven participants' data files was processed individually. First, we identified the Area of Interest (AOI) corresponding to the on-screen location of the climate information overlay. Using a custom Python script, we loaded a snapshot from the participant's world-view video and manually annotated the AOI for each participant and task.
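A minimal sketch of this annotation step, assuming OpenCV is installed and using a hypothetical snapshot file name; the box is normalized so it can be compared directly against Pupil's gaze coordinates, whose origin is at the bottom-left of the frame:

```python
import cv2

# Load one snapshot frame exported from the participant's world-view video
# (file name illustrative).
frame = cv2.imread("world_snapshot.png")

# Drag a rectangle over the overlay and press Enter to confirm.
x, y, w, h = cv2.selectROI("Annotate overlay AOI", frame)
cv2.destroyAllWindows()

# Normalize to [0, 1]; flip y because image rows grow downward while
# Pupil's norm_pos_y grows upward.
H, W = frame.shape[:2]
aoi = {
    "x_min": x / W,
    "x_max": (x + w) / W,
    "y_min": 1 - (y + h) / H,
    "y_max": 1 - y / H,
}
print(aoi)
```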
Raw data was cleaned by normalizing timestamps, filtering out samples with a confidence score below 0.6, and dropping entries with missing values (a minimal cleaning sketch follows the list below). Following preprocessing, we visualized and analyzed gaze behavior using multiple metrics and visual tools:
- Gaze positions and gaze confidence over time
- Pupil diameter variation over time
- Gaze overlays on world-view videos (with and without fixation points)
- Gaze velocity over time
- Fixation density within and outside the AOI
- Heatmaps of gaze distribution
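The cleaning steps amount to a few pandas operations. This sketch assumes the standard gaze_positions.csv schema exported by Pupil Player (gaze_timestamp, confidence, norm_pos_x, norm_pos_y); the file path is illustrative.

```python
import pandas as pd

gaze = pd.read_csv("gaze_positions.csv")

# Normalize timestamps so each recording starts at t = 0 seconds.
gaze["t"] = gaze["gaze_timestamp"] - gaze["gaze_timestamp"].min()

# Keep only confident, complete samples.
gaze = gaze[gaze["confidence"] >= 0.6]
gaze = gaze.dropna(subset=["norm_pos_x", "norm_pos_y"])
```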
We then computed a comprehensive set of quantitative metrics for each participant and task:
- Fixation Count - Total number of times the eyes stopped and focused on a location — indicates attention points.
- Fixations in AOI - Number of fixations specifically within the overlay region.
- Gaze Points in AOI - Total number of raw gaze positions (data samples) that landed inside the AOI — reflects visual attention volume.
- TTFG (Time to First Gaze) - Time from when the screen appeared until the participant first looked at the AOI — indicates speed of attention.
- Total Time in AOI (ms) - Total time spent looking within the AOI, measured in milliseconds — shows how engaging or distracting the area was.
- Entropy - A measure of gaze randomness or dispersion — high entropy means attention was scattered, low means focused.
- Mean Pupil Diameter - Average size of the pupil during the entire task — can reflect cognitive load or arousal.
- Pupil Diameter in AOI - Average pupil size while looking at the AOI — useful for detecting emotional or mental reactions to specific content.
- Mean Velocity - Average speed of eye movements between points — shows overall scanning behavior.
- Max Velocity - Fastest recorded eye movement — may indicate sudden shifts in attention or screen scanning bursts.
These metrics were stored in a unified dataset (all_participant_metrics.csv) to enable comparative statistical analysis.
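As a concrete illustration, the sketch below derives a few of these metrics from a cleaned gaze DataFrame (as produced earlier) and the normalized AOI. Entropy here is Shannon entropy, H = -Σ p_i log2 p_i, over the spatial distribution of gaze points; the 8x8 grid size and column names are assumptions of this sketch. Fixation-based metrics can be derived analogously from Pupil Player's fixation export.

```python
import numpy as np
import pandas as pd

def gaze_metrics(gaze: pd.DataFrame, aoi: dict) -> dict:
    """Derive AOI and dispersion metrics from cleaned gaze samples."""
    mask = (gaze["norm_pos_x"].between(aoi["x_min"], aoi["x_max"])
            & gaze["norm_pos_y"].between(aoi["y_min"], aoi["y_max"]))
    dt = gaze["t"].diff().fillna(0)  # seconds between consecutive samples

    # Shannon entropy of gaze dispersion over an 8x8 spatial grid.
    gx = np.clip((gaze["norm_pos_x"] * 8).astype(int), 0, 7)
    gy = np.clip((gaze["norm_pos_y"] * 8).astype(int), 0, 7)
    p = np.bincount(gy * 8 + gx, minlength=64) / len(gaze)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))

    return {
        "GazePointsInAOI": int(mask.sum()),
        # Attribute each sample's preceding interval to wherever it landed.
        "TotalTimeInAOI_ms": float(dt[mask].sum() * 1000),
        # Time to first gaze: earliest timestamp inside the AOI.
        "TTFG_ms": float(gaze.loc[mask, "t"].min() * 1000) if mask.any() else float("nan"),
        "Entropy": float(entropy),
    }
```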
Cross-Task Comparative Analysis
In the analysis code, Task A was labeled as the condition with the climate information overlay, and Task B as the condition without it. For each extracted metric, we applied the Shapiro-Wilk test to assess the normality of the distribution. Based on the results, we performed either paired t-tests (for normally distributed metrics) or Wilcoxon signed-rank tests (for non-normal distributions).
These statistical tests enabled us to rigorously compare whether the presence of the climate overlay introduced statistically significant differences in gaze behavior, attention allocation, or cognitive engagement.
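The decision procedure maps directly onto SciPy. The sketch below assumes all_participant_metrics.csv holds one row per participant and task, with a task column ('A'/'B') and a participant column used to align the pairs; here the Shapiro-Wilk test is applied to the paired differences.

```python
import pandas as pd
from scipy import stats

metrics = pd.read_csv("all_participant_metrics.csv")

for metric in ["FixationsInAOI", "TotalTimeInAOI_ms", "Entropy"]:
    # Task A = with overlay, Task B = without; sort so pairs line up.
    a = metrics[metrics["task"] == "A"].sort_values("participant")[metric].to_numpy()
    b = metrics[metrics["task"] == "B"].sort_values("participant")[metric].to_numpy()

    # Shapiro-Wilk on the paired differences chooses the test.
    _, p_norm = stats.shapiro(a - b)
    if p_norm > 0.05:
        stat, p = stats.ttest_rel(a, b)   # normally distributed differences
        test = "paired t-test"
    else:
        stat, p = stats.wilcoxon(a, b)    # non-normal differences
        test = "Wilcoxon signed-rank"
    print(f"{metric}: {test}, statistic = {stat:.3f}, p = {p:.3f}")
```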
Data Analysis Results & Inference
Our analysis revealed several insights about participant interaction with the overlay:
- Although participants gazed at the overlay multiple times, they did not fixate on it, suggesting a lack of sustained attention.
- Metrics such as GazePointsInAOI and TotalTimeInAOI confirmed that users did look at the overlay area intermittently.
- The absence of fixations on the overlay may indicate that the overlay was either not visually salient or not considered relevant to the task.
- High variability in Time to First Gaze (TTFG) and Entropy suggested individual differences—some participants noticed the overlay quickly, while others appeared to ignore it.
- Pupil diameter and gaze velocity data showed no consistent trends across conditions.
- Overall, the overlay neither significantly distracted nor strongly attracted user attention.
- Given the small sample size (n=7), subtle behavioral effects may not have been statistically detectable.
Prompt Analysis
We analyzed the number of prompts participants submitted under both conditions to evaluate whether the climate overlay affected their iteration behavior.
| Participant | Prompts without climate overlay | Prompts with climate overlay | Change |
|---|---|---|---|
| P1 | 4 | 3 | -1 |
| P2 | 2 | 2 | 0 |
| P3 | 6 | 4 | -2 |
| P4 | 2 | 2 | 0 |
| P5 | 2 | 4 | +2 |
| P6 | 4 | 3 | -1 |
| P7 | 3 | 3 | 0 |
| Mean | 3.29 | 3.00 | -0.29 |
The results indicate no substantial difference in the number of prompts used between the two conditions; on average, the overlay reduced iteration by only 0.29 prompts. Participant behavior appeared to reflect personal style more than sensitivity to the environmental feedback: some users preferred brief, direct prompts, while others explored descriptive prompts regardless of the overlay. Even when participants acknowledged the climate information, they still prioritized achieving a satisfactory image, suggesting that the overlay did not strongly influence prompt length or completion strategies.
Qualitative Analysis
Participants shared a variety of reactions to the climate information overlay, offering insights into both its perceived value and design limitations.
Tensions Between Task Completion and Environmental Awareness
Participants expressed mixed reactions to the climate impact overlay. While P2 noted that “seeing the actual numbers made me think twice” about generating more images, they also admitted it conflicted with their goal: “I didn’t really get what I wanted.” Others, such as P3, were less influenced, stating, “just saying that I’m using energy isn’t going to get me to stop.” One participant summed up a more task-driven mindset with, “Gotta do what you gotta do.” These responses suggest that while the overlay raised some awareness, it was often not enough to meaningfully change behavior.
Perceived Trade-offs and Shifts in Strategy
Some participants reflected on how the climate overlay influenced their generation strategies, even if the outcomes were not always satisfying. P4 mentioned trying to reduce iterations by aiming for more precise prompts upfront: “Instead of doing multiple iterations, try and get it right the first time [...] but it didn’t give me what I want,” later concluding, “It’s not good enough.” Similarly, P7 noted a subtle behavioral shift, saying they would “probably be satisfied a bit earlier knowing [...],” suggesting that awareness of impact encouraged them to settle for results sooner than they might have otherwise.
Discussion and Future Directions
Our findings suggest that while the climate impact overlay was not universally persuasive, it served as a behavioral nudge for some participants, prompting them to reduce prompt iterations or reconsider their approach to image generation. Some participants noted that the climate feedback shifted their behavior even when doing so conflicted with their creative goals. Design clarity also emerged as a barrier: one participant mistook the overlay for an advertisement, highlighting the need for more intuitive visual integration. Emotional responses ranged widely, from guilt and cognitive dissonance to indifference, underscoring the complexity of influencing climate-aware behavior through interface design.
These reflections point to the importance of both clarity and contextual relevance in shaping user response. While a simple overlay may not be sufficient for broad behavioral change, it can prompt reflection and subtle shifts in decision-making for some users. Future designs should prioritize clearer visual cues, stronger contextual framing, and deeper integration into the user workflow to enhance impact without disrupting creative engagement. We also call for policymakers to take action in regulating and increasing transparency in high-emission AI use.
References
[1] Sasha Luccioni, Yacine Jernite, and Emma Strubell. 2024. Power Hungry Processing: Watts Driving the Cost of AI Deployment? In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT '24). Association for Computing Machinery, New York, NY, USA, 85–99. https://doi.org/10.1145/3630106.3658542
[2] Vanessa Utz and Steve DiPaola. 2023. Climate Implications of Diffusion-Based Generative Visual AI Systems and Their Mass Adoption. In Proceedings of the 14th International Conference on Computational Creativity (ICCC '23).
[3] Negar Alizadeh and Fernando Castor. 2024. Green AI: a Preliminary Empirical Study on Energy Consumption in DL Models Across Different Runtime Infrastructures. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering - Software Engineering for AI (CAIN '24). Association for Computing Machinery, New York, NY, USA, 134–139. https://doi.org/10.1145/3644815.3644967
[4] Jon Froehlich, Leah Findlater, and James Landay. 2010. The design of eco-feedback technology. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '10). Association for Computing Machinery, New York, NY, USA, 1999–2008. https://doi.org/10.1145/1753326.1753629
[5] Angela Sanguinetti, Kelsea Dombrovski, and Suhaila Sikand. 2018. Information, timing, and display: A design-behavior framework for improving the effectiveness of eco-feedback. Energy Research & Social Science 39 (2018), 55–68. https://doi.org/10.1016/j.erss.2017.10.001