Eyes on the Code: Tracking Attention in AI-Assisted IDEs
AI-powered coding tools promise to boost productivity, but how do they affect focus, learning, and developer behaviour in practice?
Methods
This study combines eye-tracking data and retrospective think-aloud interviews to explore how student developers allocate attention when coding with AI assistance. Participants with one to five years of Python experience completed two tasks: the Library task, which required using an unfamiliar password-generation library, and the Mystery task, which involved interpreting an opaque function and writing an explanatory comment. By comparing experience levels and task types, this study uncovers patterns in how AI support is used, avoided, or integrated into different workflows. Implications extend to the design of supportive Human-AI interaction systems in both educational and professional coding environments. For this study, we chose a leading AI-first IDE due to personal experience, access, and the suitability of its user interface for area-of-interest (AOI) analysis.
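To make the AOI analysis concrete, here is a minimal sketch of how gaze share per AOI can be computed from fixation data. This is illustrative rather than our exact pipeline: the AOI bounding boxes, screen coordinates, and fixation format are all hypothetical.

```python
# Minimal sketch of AOI-based gaze analysis (illustrative; not our exact pipeline).
# Each fixation is (x, y, duration_ms); AOIs are hypothetical screen rectangles.

AOIS = {  # (x_min, y_min, x_max, y_max) in pixels -- assumed layout
    "editor":   (0,    0,    1200, 1200),
    "chat":     (1200, 0,    1920, 1200),
    "terminal": (0,    1200, 1920, 1440),
}

def classify(x, y):
    """Return the AOI containing the point, or None if it falls outside all AOIs."""
    for name, (x0, y0, x1, y1) in AOIS.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None

def gaze_share(fixations):
    """Fraction of total fixation time spent in each AOI."""
    totals = dict.fromkeys(AOIS, 0.0)
    for x, y, duration_ms in fixations:
        aoi = classify(x, y)
        if aoi is not None:
            totals[aoi] += duration_ms
    grand_total = sum(totals.values()) or 1.0  # avoid division by zero
    return {name: t / grand_total for name, t in totals.items()}

# Example: three fixations, mostly in the editor.
print(gaze_share([(400, 300, 250), (1500, 200, 180), (600, 900, 400)]))
```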
Key Findings
By analysing gaze patterns, saccade transitions, and user reflections, the study reveals how AI assistance reshapes attention:
- TASK TYPE SUPPRESSES AI CHAT USE. Less than 1% of gaze time was spent in the chat window during the Mystery task versus ~19% during the Library task (averaged across all participants). The chat can act as a new search modality for managing and using custom libraries with unfamiliar parameters or syntax. However, some intermediate and expert users still preferred to read the official documentation in a browser. This may suggest that programmers prefer to interpret logic independently (at least for short functions) before turning to AI assistants.
- AI CHAT FRAGMENTS ATTENTION. In the 5-minute Library task, the chat was used more frequently by intermediate programmers: code-chat saccades averaged 153 per intermediate participant versus 37 per expert participant (see the transition-counting sketch after this list). Participants most often prompted the chat for help implementing the unfamiliar library and for debugging console errors, particularly around its installation and parameters.
- INLINE SUGGESTION LENGTH AFFECTS USABILITY. All participants encountered several inline AI suggestions while reading and writing code. Although further gaze analysis of inline suggestions is warranted, participants reported that they tend to ignore multi-line suggestions. Single-line suggestions were easier to evaluate, but many participants still preferred to write comments and code themselves to maintain control.
- EXPERTS USE THE TERMINAL. More experienced programmers recorded an average of 334 code-terminal saccades versus 106 for intermediate users. By contrast, almost no chat-terminal saccades were recorded across all participants, and intermediate participants favoured running AI-suggested terminal commands in-chat over using the terminal themselves.
The above findings are exploratory due to the small sample size (n=6). This research is ongoing, but it so far suggests that programming expertise may influence visual engagement with code in AI-assisted environments.
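For readers interested in how the saccade-transition counts above can be derived, the sketch below counts changes of AOI between consecutive fixations. Treating each AOI change as one code-chat (or code-terminal) transition is a simplification, and the AOI labels are the hypothetical ones from the Methods sketch.

```python
from collections import Counter

def transition_counts(aoi_sequence):
    """Count AOI-to-AOI transitions between consecutive fixations.

    `aoi_sequence` holds per-fixation AOI labels in time order (e.g. the
    output of classify() applied to each fixation). Transitions are counted
    as unordered pairs, so editor->chat and chat->editor both add to the
    ('chat', 'editor') count; None labels (gaze outside all AOIs) are skipped.
    """
    counts = Counter()
    for a, b in zip(aoi_sequence, aoi_sequence[1:]):
        if a and b and a != b:
            counts[tuple(sorted((a, b)))] += 1
    return counts

# Example: a participant glancing between the editor and the chat.
sequence = ["editor", "editor", "chat", "editor", "chat", "terminal"]
print(transition_counts(sequence))
# Counter({('chat', 'editor'): 3, ('chat', 'terminal'): 1})
```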
Design Implications for AI IDEs
Despite the small sample, the depth of data collected means some early implications can be gleaned from these findings, which may also inform future research:
- The triggers and content for AI suggestions could be tailored to both user experience level and the contents of the file being edited. More experienced users might prefer a less intrusive assistant, while novices and intermediates might prefer proactive support.
- Single-line or progressive disclosure patterns could reduce cognitive load for programmers while writing code (a minimal sketch of one such pattern follows this list). Inline suggestions designed for quick coding were found to be disruptive, particularly during code interpretation and comprehension.
- Agentic use of the terminal could be better integrated with the main terminal window. Some participants used the chat to debug console errors from the terminal but were then moved into an agentic debugging workflow that ran entirely in the chat window and that they could not replicate themselves in the main terminal.
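As one illustration of the progressive-disclosure implication above, the sketch below models a policy in which only the first line of a multi-line suggestion is rendered, and each further line appears only on explicit acceptance. The Suggestion class and its reveal semantics are hypothetical; a real implementation would hook into the IDE's inline-completion API.

```python
# Hypothetical progressive-disclosure policy for multi-line inline suggestions.
from dataclasses import dataclass

@dataclass
class Suggestion:
    lines: list        # full multi-line completion from the model
    revealed: int = 1  # start by showing only the first line

    def visible_text(self):
        """Portion of the suggestion currently rendered as ghost text."""
        return "\n".join(self.lines[:self.revealed])

    def reveal_more(self):
        """Called when the user accepts the visible portion (e.g. presses Tab)."""
        self.revealed = min(self.revealed + 1, len(self.lines))

s = Suggestion(lines=["total = 0", "for x in items:", "    total += x"])
print(s.visible_text())  # only the first line is offered
s.reveal_more()
print(s.visible_text())  # the next line appears after explicit acceptance
```

A policy like this preserves the single-line evaluability participants preferred while still keeping longer completions reachable.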
Acknowledgements
This research was prompted by a research team at Google for the DORA (DevOps Research and Assessment) community. Their initial question concerned the use and usefulness of AI IDEs for student developers, and they provided mentorship on research methods, implications, and developer tooling. We would like to thank Steve Fadden for inviting us to collaborate with his colleagues and for his continued mentorship throughout the project. Finally, we would also like to thank Professor John Chuang, the instructor of the iSchool's Biosensory Computing class, for providing the biosensors and primary supervision for the gaze-based methods.