Gender Diversity in AI Research
Through an analysis of 1.5M papers from arXiv, this study reviews the evolution of gender diversity across disciplines, countries, and institutions as well as the semantic differences between AI papers with and without female co-authors.
By Kostas Stathoulopoulos, Nesta.
Artificial intelligence (AI) increasingly mediates our social, cultural, economic, and political interactions. From improved medical applications to self-driving cars and smart cities, AI has the potential to transform our digital, physical, and social environments in unprecedented ways and at an unprecedented speed. However, the same technologies can be used for mass surveillance, computational propaganda, and biased, discriminatory decision-making. It is generally believed that increasing the diversity of the workforce developing AI systems will reduce the risk that they generate discriminatory and unfair outcomes, thus ensuring that their benefits are more widely shared.
But how diverse is the workforce of the AI sector?
What we did
We conducted a large-scale analysis of gender diversity in AI research using publications from arXiv, a widely used preprint repository. We identified AI papers through an expanded keyword analysis and predicted author gender using a name-to-gender inference service. We studied the evolution of gender diversity across disciplines, countries, and institutions. We also examined the link between female authorship of a paper and the citations it receives, and investigated the semantic differences between AI papers with and without female co-authors. Lastly, we interviewed female AI researchers and other important stakeholders in order to interpret our findings and identify policies to improve diversity and inclusion in the AI research workforce.
There is a serious gender diversity crisis in AI research
Only 13.83% of authors in arXiv are women and, in relative terms, the proportion of AI papers co-authored by at least one woman has not improved since the 1990s. These aggregate statistics mask significant differences between research domains. Our analysis shows that the proportion of papers in Machine Learning, Robotics, and other data-related topics with at least one female author has remained stable, at around 25%, throughout the time frame of our analysis. This also holds for Informatics, where approximately 20% of the papers have a female author. By contrast, in other quantitative disciplines that are not closely related to Computer Science, the share of papers with female authors has been steadily increasing.
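The share of papers with at least one female author, tracked by year, is the kind of metric discussed above. A minimal sketch of how such a share could be computed is given below. It assumes each paper already carries one inferred gender label per author; the records, field names, and the function `share_with_female_author` are hypothetical and for illustration only, and are not the study's data or code.

```python
from collections import defaultdict

# Hypothetical toy records, NOT data from the study. Each paper has a
# publication year and one inferred gender label per author
# ("female", "male", or None when the inference was inconclusive).
papers = [
    {"year": 2016, "author_genders": ["male", "male"]},
    {"year": 2016, "author_genders": ["female", "male", None]},
    {"year": 2017, "author_genders": ["male"]},
    {"year": 2017, "author_genders": ["female"]},
    {"year": 2017, "author_genders": ["male", "male", "male"]},
    {"year": 2017, "author_genders": [None, "male"]},
]


def share_with_female_author(records):
    """Return {year: share of papers with at least one author labelled female}."""
    totals = defaultdict(int)
    with_female = defaultdict(int)
    for paper in records:
        totals[paper["year"]] += 1
        # A paper counts once, however many female authors it has.
        if "female" in paper["author_genders"]:
            with_female[paper["year"]] += 1
    return {year: with_female[year] / totals[year] for year in sorted(totals)}


shares = share_with_female_author(papers)
for year, share in shares.items():
    print(f"{year}: {share:.0%} of papers have at least one female author")
```

Note that a paper-level share like this is a different quantity from the author-level share (the proportion of all authors who are women), so the two figures should not be compared directly.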
Location and research domain play an important role in gender diversity
Women in the Netherlands, Norway, and Denmark are more likely to publish AI papers, while those in Japan and Singapore are less likely to. Moreover, women working in physics, education, biology, and computer ethics and other societal issues are more likely to publish work on AI than those working in computer science or mathematics.
There is a significant gender diversity gap in universities, big tech companies and other research institutions
Apart from the University of Washington, every academic institution and organisation in our dataset has fewer than 25% female AI researchers. Among big tech companies, only 11.3% of Google’s employees who have published their AI research on arXiv are women; the proportion is similar for Microsoft (11.95%) and slightly better for IBM (15.66%).
There are important semantic differences between AI papers with and without a female co-author
When examining the publications in the Machine Learning and Societal topics in the United Kingdom in 2012 and 2015, papers involving at least one female co-author tend to be more semantically similar to each other than to those without any female authors. Moreover, papers with at least one female co-author tend to be more applied and socially aware, with terms such as fairness, human mobility, mental, health, gender and personality being among the most salient ones.
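One way to make "more semantically similar to each other" concrete is to compare the average similarity within a group of papers against the average similarity between that group and another. The sketch below does this with bag-of-words cosine similarity, which is an assumption chosen for simplicity; the summary above does not state which similarity measure the study used. The abstracts and function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations, product


def vectorise(text):
    """Turn a text into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def within_and_between(group_a, group_b):
    """Mean pairwise similarity inside group_a, and between group_a and group_b."""
    vecs_a = [vectorise(text) for text in group_a]
    vecs_b = [vectorise(text) for text in group_b]
    within = mean(cosine(x, y) for x, y in combinations(vecs_a, 2))
    between = mean(cosine(x, y) for x, y in product(vecs_a, vecs_b))
    return within, between


# Hypothetical toy abstracts, NOT taken from the study's corpus.
group_a = [
    "fairness in mental health prediction",
    "fairness and gender in health data",
]
group_b = [
    "convergence bounds for gradient descent",
    "gradient descent in deep networks",
]

within, between = within_and_between(group_a, group_b)
print(f"within-group similarity:  {within:.3f}")
print(f"between-group similarity: {between:.3f}")
```

Raw word counts are sensitive to common words and to document length, so a real analysis would more likely use weighted or embedding-based representations; the within-versus-between comparison itself stays the same.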
System-wide changes are required in order to reduce the AI gender gap
Our qualitative interviews with key stakeholders suggest that a variety of interventions is needed, such as encouraging women to study and work in AI and Computer Science, creating safe and inclusive spaces that support and promote researchers from underrepresented groups, and communicating more widely the transformative potential of AI in many domains and sectors. All this will require leadership, funding, and changes in organisational cultures and attitudes.
The diversity gap is deeply connected to issues in the education system, intersectional inequalities, workplace practices, and persistent stereotypes around identity. We hope that by expanding the evidence on gender diversity in AI research, providing a baseline against which progress can be tracked, and engaging with leading figures in education, AI and policy, we can begin to understand in more detail the social and institutional determinants of gender diversity in AI and identify effective interventions to improve it.
If you have any questions about our work or would like to collaborate with us, contact me at konstantinos [dot] stathoulopoulos [at] nesta [dot] org [dot] uk.
Bio: Kostas Stathoulopoulos is a Principal Researcher (Data Science) at Nesta, the UK’s Innovation Foundation, and works at the intersection of data science, economics, and policy making. Kostas uses open data and machine learning to inform innovation policy by providing timely evidence and building tools for policymakers.
Original. Reposted with permission.