The remarkable capacities of AI-infused and data intensive systems are regularly presented as emblematic of technoscientific innovation and progress. Far less recognised is the human labour required to sift through and sort data, and ultimately train these systems. This is a labour distributed well beyond the presumed centres of technological innovation, and spun into and strewn across geographically distributed regions in the "Global South". This workshop will bring together scholars and practitioners interested in and conducting research on the hidden labours that lie behind AI and data intensive systems. Participants will have the opportunity to share perspectives or results of their research and, through highly exploratory engagements and dialogues, set out a critical mode of inquiry and future directions with and for a nascent community.
The submission deadline is Wednesday 15 September 2021 (midnight, anywhere on Earth).
Submissions are invited in a wide variety of media forms, ranging from two-page position pieces to pictorials or short (TikTok-like) videos.
Workshop organisers will dedicate a day to collectively reviewing the submissions, with a target of 15 to 20 participants. Content will be judged on its relevance to the call and its capacity to provoke discussion and critical inquiry.
Selection will follow an inclusive model: as organisers, we will especially welcome work that represents a diverse community of scholarship and practice.
Notifications of acceptance will be sent by 24 September 2021.
The workshop will be hosted online using an accessible conferencing system and held over one day. Timings will be designed to limit the time participants must spend online, both to reduce fatigue and to enable a global audience. We anticipate the workshop will run for four to five hours.
Activities will be designed to engage participants in extended and progressively refined discussions before, during and after the workshop.
Pre-workshop:— Accepted submissions will be shared with participants in advance, with the expectation that they will be read or viewed ahead of the workshop. We will choose an appropriate platform to host the submissions so that comments and questions can be shared asynchronously and preserved after the workshop. As organisers, we will use the submissions and subsequent commentaries to produce a range of provocations (short statements, narratives, visual materials, etc.) inviting the examination or queering of significant themes. These provocations will also be shared in advance of the workshop, allowing for a further layering of ideas and reflections from participants.
Dialogue groups:— At the workshop, after brief introductions, much of the time (approx. 2 to 3 hours, including breaks) will be spent in small dialogue groups made up of changing combinations of three to four people. These small groups will work experimentally with the provocations and participants’ commentaries. Groups will iteratively produce various media (notes, sketches, collages, etc.) in a shared workshop notebook (e.g. Notion or Miro) to interject in, reflect on and capture views ranging from the specific tasks behind AI and data intensive systems to the wider global structures they operate in.
Collective reflections:— The dialogue groups will be interspersed with two short periods for collective reflection and a longer closing activity where all participants will be able to look across and reflect on the workshop’s outputs. The timings and details of the two synchronous activities (this and the dialogue groups above) have been kept deliberately loose, as we know from experience that exploratory workshops such as this one need to adapt to what is working best for participants.
Post-workshop:— Through the media-rich records produced, discussion and reflection will continue after the workshop. Particular focus here will be on the goals set out below.
Aligned with the background and context for the workshop presented in the introduction, the activities described above will progressively work towards the following goals:
This workshop will bring together scholars and practitioners interested in and conducting research on the hidden labours behind AI and data intensive systems. Attention will be given to the global character of these labours. This will mean examining how the unacknowledged yet still essential work of AI is distributed well beyond the presumed centres of technological innovation, and spun into and strewn across geographically distributed regions in the euphemistically termed “Global South”.
Brought together by this focus and a corresponding mixture of intersecting interests, workshop participants will have the opportunity to share perspectives or results of their research with a nascent community. Topics of concern will include, but not be limited to, the labours that revolve around data labelling, content moderation, microtasking, platform economies, and, more broadly, a globalised gig-work, and the concentration of such labour across the cities (and sometimes rural regions) of India, Kenya, Vietnam, Venezuela, the Philippines, and so on.
The sharing of perspectives and materials will help set out a critical mode of inquiry with and for this nascent community.
Stimulated by the shared insights and materials, and this critical mode of inquiry, the workshop will go on to use the format of multiple and intersecting dialogues between participants to discuss and speculate on plausible futures. The dialogues and attendant speculative futures will concentrate on re-imagining the uneven geographies of technoscientific innovation. They will seek to build on rather than obfuscate the global supply chains of data, knowledge and skill that have enabled a thriving tech sector in the Global North (see [7] for one such example). And they will project design and policy interventions for more just and equitable practices for the global labourers enabling AI and data intensive systems.
The remarkable capacities of AI-infused and data intensive systems are regularly lauded not only by the digerati but in the popular press. Browse Wired or TechCrunch and in minutes you’ll encounter an article crediting the power of AI or the deluge of data for supplying the ‘hot sauce’ in the next big social networking startup, for surpassing some milestone in driverless transport, or for meeting the pressing challenges faced in health screening.
Far less recognised, beyond some notable exceptions (e.g. [3, 5, 7, 10]), is the human labour required to sift through and sort data, and ultimately train these systems. Crowdsourcing platforms, worldwide, employ thousands of workers to read texts, view images and video, and label data to produce the models AI systems rely on. A straightforward example here is the labelling of faces in photographs. Although many reports of computer vision systems tout the incredible performance of algorithms that identify faces, what is rarely given attention is the work involved in labelling the data required to train and refine computer vision models. Complicating such labours are the norms imposed on labellers. This enforced homogeneity of norms (used, for example, in the normalisation of image classification) masks frictions between multiple layers of meaning and, often, culturally sensitive value systems [8, 11].
Those workers sifting, sorting and labelling data can encounter further complexities and ambiguities when, for example, doing the work of content moderation [4]. Again, technological innovations in content moderation point to new algorithmic techniques for filtering language and, in some cases, provide important tools for reducing genuine harms in society, such as identifying graphic or violent imagery, or hate speech. Yet overlooked is the global workforce that supplements such automation, or trains and validates the AI, and, crucially, the impacts on that workforce. Telling is that only recently have reports of the physical demands and, perhaps more importantly, the psychological damage accompanying such labours reached a wider public [1].
Altogether, these examples and others prefigure a globalised gig-economy, a structural configuration that concentrates wealth and authority (although not necessarily agency) in the Global North and relies on a little recognised, underpaid, disenfranchised, and largely unregulated workforce in the Global South. This structural agglomeration of actors, software systems and platforms, and flows of data further cements an elitist technoscience—what the anthropologist Anna Tsing refers to as the “globe-crossing capital and commodity chains” of capitalist forms [12, p. 4].
The specifics of the labours that constitute gig-work and the wider political economies that these labours are a part of will form the context for the Global Labours of AI and Data Intensive Systems workshop. Participants will bring accounts of labour or informed perspectives in order to compile a corpus of related and current work in the area.
Past work in CSCW and beyond has drawn attention to precisely these hidden and often exploitative labours [2, 5, 7, 10]. Building on this, our own work investigates a contemporary moment in which the global gig-economy is coming to rely on new organisational actors emerging in the Global South [9]. For example, small platform startups like iMerit are creating new models to compete with the likes of Amazon’s Mechanical Turk. Working from a very different business model, iMerit has moved away from simply sourcing and providing microtasks for its labellers and towards an overtly ethical model that prioritises worker training and the development of local expertise (in iMerit’s case, in India).
This changing landscape reveals an evolving and expanding matrix of dependencies, intensifying Tsing’s crossings, chains and connections. It highlights, for example, tensions between, on the one hand, the systemic deskilling of data work and, on the other, the growing need to recognise the value of "AI dataset expertise" [6]. Invited here is a re-centering of global data workers as situated and agile experts. The extensive domain and bias sensitivity training they must undergo, and the ways they capitalise on existing forms of expertise, must be seen as processes through which they constantly reproduce themselves as fast-adapting, flexible and responsive actors who are crucial to wider global structures. Thus, what is needed is not just a recognition that gig workers are hidden from view or marginalised in innovation and regulatory processes, but directions that acknowledge and reward the attention, discretion and care given to data work.
Altogether, what the fluxes in crossings, chains and connections suggest is the possibility of reconfiguring the status quo and, in particular, reworking capital flows, agency, authority and values. For example, with startups like iMerit, we see the potential for decentering those actors that have hitherto dictated the structural configurations of the labours behind AI and data intensive systems and, in turn, for thinking more generatively about the recognition and reward of these distributed and varied global labours.
It’s this that we will take as a starting point for a critical mode of inquiry in the proposed workshop. The possibility for change will be used to set the conditions for speculating on alternate futures, as well as for intervening in or even refusing prevailing capitalist forms. Whether models like iMerit’s go far enough to achieve more just and equitable reconfigurations will help to stimulate one line of inquiry. Similarly, the modes of moving forward, for example by intervening in platform design or advising on corporate or national labour policies, will form the basis for articulating and reflecting on what alternate futures we should be seeking.
[1] Angel Au-Yeung. 2021. At Risk Of Losing Their Jobs, Facebook Content Moderators In Ireland Speak Out Against Working Conditions. Retrieved June 11, 2021 from URL.
[2] Paško Bilić. 2016. Search algorithms, hidden labour and information control. Big Data & Society 3, 1 (2016), 2053951716652159.
[3] Kate Crawford. 2021. Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale University Press.
[4] Tarleton Gillespie. 2020. Content moderation, AI, and the question of scale. Big Data & Society 7, 2 (2020).
[5] Mary L Gray and Siddharth Suri. 2019. Ghost work: How to stop Silicon Valley from building a new global underclass. Eamon Dolan Books.
[6] Ben Hutchinson, Andrew Smart, Alex Hanna, Emily Denton, Christina Greer, Oddur Kjartansson, Parker Barnes, and Margaret Mitchell. 2021. Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Virtual Event, Canada) (FAccT ’21). Association for Computing Machinery, New York, NY, USA, 560–575. URL.
[7] Lilly C. Irani and M. Six Silberman. 2013. Turkopticon: Interrupting Worker Invisibility in Amazon Mechanical Turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Paris, France) (CHI ’13). Association for Computing Machinery, New York, NY, USA, 611–620. URL.
[8] Milagros Miceli, Tianling Yang, Laurens Naudts, Martin Schuessler, Diana Serbanescu, and Alex Hanna. 2021. Documenting Computer Vision Datasets: An Invitation to Reflexive Data Practices. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Virtual Event, Canada) (FAccT ’21). Association for Computing Machinery, New York, NY, USA, 161–172. URL.
[9] Sarayu Natarajan, Kushang Mishra, Suha Mohamed, and Alex S. Taylor. 2021. Just and equitable data labelling: Towards a responsible AI supply chain. Technical Report. Aapti Institute, Bangalore, India. URL.
[10] Noopur Raval and Paul Dourish. 2016. Standing Out from the Crowd: Emotional Labor, Body Labor, and Temporal Labor in Ridesharing. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing (San Francisco, California, USA) (CSCW ’16). Association for Computing Machinery, New York, NY, USA, 97–107. URL.
[11] Morgan Klaus Scheuerman, Kandrea Wade, Caitlin Lustig, and Jed R. Brubaker. 2020. How We’ve Taught Algorithms to See Identity: Constructing Race and Gender in Image Databases for Facial Analysis. Proc. ACM Hum.-Comput. Interact. 4, CSCW1, Article 058 (May 2020), 35 pages. URL.
[12] Anna Tsing. 2004. Friction: An Ethnography of Global Connection. Princeton University Press.