MIMS Final Project 2023

Evaluating Algorithmic Fairness in AI Recruiting Solutions

Team members

Advisor

Abstract

As AI-enabled recruiting tools become more prevalent in the hiring process, it is critical to gain a comprehensive understanding of the algorithms that power them. To this end, we conducted a two-part mixed-methods study. In Study 1, we held semi-structured interviews with current technical recruiters and recruiting managers, which gave us a more nuanced understanding of whom they consider to be a "typical" software engineer. These interviews also highlighted the significance of candidate search tools in determining hiring outcomes. In Study 2, we performed an algorithmic audit of an industry-leading recruitment software product. We analyzed the candidates in the tool's search outputs by race, gender, education, location, and company. We found that the recruitment software's algorithms seem to favor candidates with a conventional education and career path, as well as those working for FAANG and FAANG-equivalent companies. Taken together, these studies suggest a feedback loop between recruiter beliefs about candidate typicality and the search outputs of AI-enabled recruiting tools. These results have implications for diversity and inclusion in the tech industry, and challenge claims that AI can promote inclusive recruitment pipelines. This research provides insight into the broader implications of the tech industry's adoption of AI-based recruiting and informs strategies for the future of the recruitment industry.

Background

The tech industry has often struggled to support diverse candidates in a variety of roles, particularly software engineering. A resume study by Quadlin (2018) found that recruiters preferred candidates who applied for jobs that aligned with stereotypes of the candidate’s identity (e.g., male candidates were preferred for computational roles). Relatedly, in a recent interview conducted by our research team, a recruiter stated that “recruiters need stereotypes,” because the sheer volume of hiring prevents them from closely assessing candidates’ individual traits and potential. Given this volume, the recruiting industry has increasingly turned towards AI-enabled recruiting tools. While AI-enabled recruiting holds great potential to increase business efficiency and reduce human error and biases, it also threatens to reify and magnify many of these same biases, which are often entrenched within the tech industry.

Recruitment, and the AI-enabled tools used by recruiters, represents a useful lens through which to study biases within the tech industry. First, recruiting shapes the demographics of the workplace. While there are of course many internal factors that influence who succeeds within companies, recruiting determines who is admitted to begin with. Additionally, when a company develops a recruitment process, it in effect distills its core values into a series of skills assessments and interview questions. Therefore, studying recruiting allows us to study the values of the tech industry at large. Finally, from a practical perspective, recruiting is easier to study than other corporate processes. For example, it would be nearly impossible to get access to data related to a company’s internal performance reviews, even though such data might tell us a lot about biases in the tech industry. However, many of these companies use third-party recruiting tools and, for a price, a researcher can gain access to the same recruiting tools used by recruiters at any major tech company. For all these reasons, recruitment represents a convenient gateway to study biases within the tech industry.

Our research seeks to evaluate biases within recruiting, and the tech industry at large, through a holistic, mixed-methods approach. While a wealth of past research has established bias within the recruiting process (Bertrand and Mullainathan, 2004; Quadlin, 2018), these studies focus primarily on the resume review stage, ignoring the processes that determine whether and how a resume reaches the recruiter in the first place. Similarly, while literature on algorithmic fairness has exploded recently, most of these studies assess the fairness of an algorithm from a purely computational perspective (Geyik, Ambler, and Kenthapadi, 2018), ignoring the systems and processes within which such algorithms are deployed. In our studies, we employ methodologies borrowed from Human-Computer Interaction (HCI) to build a detailed technical understanding of the AI-enabled recruitment tools being used, as well as the recruitment attitudes and processes within which such tools are deployed. Study 1 uses qualitative research methods such as semi-structured interviews to build an understanding of recruiters’ attitudes about candidates, their recruiting practices, and their search processes. In Study 2, we combine elements of human factors research and algorithmic auditing to evaluate the outcomes of these processes in an existing candidate search interface.

Study 1

Study 1 offers insight into the recruiting processes that contribute to the limited diversity seen in the tech industry. Through interviews with tech industry recruiting professionals, we build an understanding of recruiters’ attitudes about candidates, their recruiting practices, and the search processes they engage in to identify candidates. A number of key themes emerged from these interviews. First, we found that recruiters appear to have a collective definition of the resume markers that constitute the “prototypical” software engineer, including a Bachelor’s or Master’s in Computer Science and a current Software Developer title. Second, “prestigious” resume markers (such as employment at a FAANG company) can increase a candidate’s desirability. Finally, we observed the impact of recruiting tools on recruitment decisions, and the ways that these tools are actually used to train and influence recruiters’ beliefs about candidate typicality. While these interviews did not directly tell us how recruiters’ attitudes or processes might impact candidates along the axes of race and gender, they do provide a framework for a more targeted assessment of certain components of the recruitment process.

Study 2

Study 2 builds upon the results of Study 1, using methods from HCI and algorithmic auditing to assess the search outputs of a popular AI-enabled recruiting tool. By analyzing the results of common recruiting searches, we sought to understand the impact that these search patterns have on the recruitment pipeline, the key resume markers that the recruiting tool uses to identify candidates and rank their suitability for roles, and whether these resume markers may lead to unfair outcomes for candidates along the dimensions of race and gender. Through this research, we identified resume markers that the recruiting tool weights heavily to establish relevance. We also discovered that candidates from FAANG companies constitute a disproportionate percentage of search results, that there is a lack of racial and gender diversity in search results, and that there are high rates of candidate duplication, especially among female and Black software engineers. While these results did not clarify exactly how the recruiting tool ranks candidates, they do provide valuable insight into how recruiters’ use of AI sourcing tools may impact the diversity of candidates in their pipeline, subsequently shaping the demographics of the tech industry as a whole.
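
To make the audit measures above concrete, here is a minimal sketch of how two of them (the FAANG share of search results and the duplication rate per demographic group) could be computed. It is not the project's actual analysis code. It assumes that search results are pooled into a list of records, and that "duplication" means the same candidate appearing more than once in the pooled results. The field names (`candidate_id`, `company`, `gender`), the helper functions, and the toy records are all hypothetical and do not come from the study.

```python
from collections import Counter

# Hypothetical company set used to flag FAANG employers in this sketch.
FAANG = {"Facebook", "Apple", "Amazon", "Netflix", "Google"}


def faang_share(results):
    """Return the fraction of result rows whose company is in FAANG."""
    if not results:
        return 0.0
    return sum(r["company"] in FAANG for r in results) / len(results)


def duplication_rate_by_group(results, group_key):
    """Return, per group, the fraction of rows that repeat an earlier candidate.

    Assumption: a row counts as a duplicate when its candidate_id has
    already appeared earlier in the pooled results.
    """
    seen = set()
    rows = Counter()
    repeats = Counter()
    for r in results:
        group = r[group_key]
        rows[group] += 1
        if r["candidate_id"] in seen:
            repeats[group] += 1
        seen.add(r["candidate_id"])
    return {group: repeats[group] / rows[group] for group in rows}


# Toy records for illustration only; these are not data from the study.
results = [
    {"candidate_id": "a", "company": "Google", "gender": "F"},
    {"candidate_id": "b", "company": "Acme", "gender": "M"},
    {"candidate_id": "a", "company": "Google", "gender": "F"},
    {"candidate_id": "c", "company": "Amazon", "gender": "M"},
]

print(faang_share(results))
print(duplication_rate_by_group(results, "gender"))
```

Comparing these rates across groups, and against a baseline for the wider candidate population, is one way an audit could flag disproportionate outcomes. The choice of baseline is itself a methodological decision that this sketch does not address.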

More Information

Paper

Presentation

Last updated: May 13, 2023