Advancing Data Justice Research and Practice

By Digital Empowerment Foundation

Published on: Jan 16, 2023

This is one of the narratives of communities impacted by data injustice, published in the research report on data justice by Digital Empowerment Foundation.

The injustices caused by the algorithms of platform and gig-economy apps have been documented previously. In India, workers in the gig economy are counted as “clients,” which deprives them of many of the protections that labour laws provide. In this unorganised sector, Shaik Salauddin of the Indian Federation Of App Based Transport Workers (IFAT) is one of the leaders organising and unionising people who work for ride-hailing and delivery apps. We spoke to him in detail about the algorithms that cause these injustices.

In December 2019, the Indian Parliament passed the controversial Citizenship Amendment Bill, alongside the government’s commitment to enforce a National Register of Citizenship (NRC). As Booker Prize winning author and activist Arundhati Roy put it, “Coupled with the Citizenship Amendment Bill, the National Register of Citizenship is India’s version of Germany’s 1935 Nuremberg Laws, by which German citizenship was restricted to only those who had been granted citizenship papers, legacy papers, by the government of the Third Reich. The amendment against Muslims is the first such amendment.” Noting the use of an automated tool to decide the lineage of people in Assam, we spoke to Abdul Kalam Azad, a researcher from Assam, now at Vrije Universiteit Amsterdam, who had looked in detail into the issues and exclusions created by the NRC in Assam. On learning that trans people, who already face an undemocratic law in the Trans Act, were excluded from the same list, we spoke to two activists from the trans community: Sai Bourothu, who had worked with the Queer Incarceration Project and the Automated Decision Research team of The Campaign to Stop Killer Robots, and Karthik Bittu, a professor of Neuroscience at Ashoka University, Delhi, and an activist who had worked with the Telangana Hijra, Intersex and Transgender Samiti.

Another exclusion we noted in our primary research was the absence of homeless people from any of the data enumerations. We spoke to Jatin Sharma and Gufran, who is part of the Homeless Shelter in Yamuna Ghat, about these exclusions and how they lead to homeless people being denied basic healthcare and life-saving TB treatment.

Four researchers, activists and civil society leaders who had done considerable work on data-related exclusions, surveillance, and identification systems such as Aadhaar offered their perspectives on the debates, conversations and potential reimaginings of data injustices. They were Srinivas Kodali, independent activist and researcher; Nikhil Dey, of the Mazdoor Kisan Shakti Sangathan; Apar Gupta, lawyer and director of the Internet Freedom Foundation; and Rakshita Swamy, an NLU professor who also heads the Social Accountability Forum for Action and Research.

Erased Identities: Trans Community answers to Data Justice

It was pointed out to us that, until 2014, institutional governance never actually recognized ‘transgender’ as an entity or as a biometric marker that an individual could have. As Sai told us, the queer and trans community has “been forcefully disappeared for 70 years of independence.” Because data is the primary marker for public policies and public welfare, the trans community has been largely invisibilized, gentrified or ghettoised over the past 70-80 years of India’s nation-building process. She explained how national statistical systems such as the National Sample Survey Organisation (NSSO) contributed to this. One key example, quoting Sai: “in the 2011 national census that captured approximately 4,11,000 trans persons exist in India. But these numbers were grossly underrepresented because even in small areas, there exist community groups. When this data was shared for cross verification, and they were divided district wise, it came out that there is very clear evidence from community estimation that there is an average of 200 people and yet the census data pointed out that there is an average of 4 transgender persons in one district. So there has been a clear disparity in how the community is represented. What is even more difficult is that this is going to inform policy. If a welfare scheme even tomorrow were to come up which determines some kind of aid for trans persons, it’s going to grant that aid with the assumption that [there are only] 4 lakh in the entire country. [It] does not take into cognizance the fact that there might be so many more who have not yet been recognized or who have not been able to go through the governmental red-tapism yet to identify as such in some places.”

Surveillance systems, like the model one being set up in Hyderabad, create another issue for the already invisibilized and ghettoised trans community. Because the community has been historically criminalised, and therefore carries criminal records for most acts of survival such as begging or sex work, any large-scale predictive policing is going to be biased against it. Whatever redressal mechanisms exist isolate the issue, making it the responsibility of the individual “to have the knowledge of the harm [they’re] going through, and then necessitated to also be able to understand the process of seeking redressal.” Some of these harms are institutional, but not all of them need be intentional. If the code were publicly available, and more people from disadvantaged communities knew and could check it, this would allow for “inclusive coding to make sure that discrimination and bias are kept in check to a certain extent through their own efforts of reviewing these codes.” This effectively highlights the power and participation pillars of data justice.

One of our interactions was with a science professor at a leading university in Delhi, a trans man who had worked actively with the community on several issues. He spoke about the NRC exclusion of trans people, and also about ML in science, to give a broad understanding of human biases. Trans people are excluded because of a combination of missing documents, since many fled abusive homes when they were young, and documents that are inconsistent. Around 2,000 trans people were excluded as a result, and a legal battle is ongoing. Other algorithmic exclusions in the country occurred in applications to institutions, where a trans person’s names were misidentified as referring to two separate people with two separate names, and the application was then summarily rejected. The same respondent also explained how ML tools work in some of the other projects he is working and collaborating on. As he explained, the ‘science’ of personality research has a long classist and racist history: a pseudoscience in which workers are analysed and assigned roles based on their personalities. When ML tools were fed datasets from classical psychology, their research showed that the program does not provide a justifiable cut-off for saying that one of these categories of personality is more valuable than another. This helps debunk the previously held theory on the psychology of personalities.

Another aspect of algorithmic injustice is that human understanding is itself based on certain algorithms, and that these algorithms are fundamentally flawed and riddled with various confirmation biases. “Human algorithms work like what we call a Bayesian learning algorithm. We see priors in how the world works, and we continue to think the world continues to work that way.” When fed datasets that do not carry a bias, AI tools can show that several patterns (which human beings, with their cognitive biases, assumed existed) do not actually exist. Race, similarly, is shown to be “an arbitrary category consisting of looking at specific combinations of superficial” factors like skin or hair; when all genes are considered together, there is no consistent difference between racial categories. In this way, ML tools can challenge existing notions of power structures. Taking the example of cancer biopsies analysed by ML tools, feeding more data into the system can make diagnosis faster and more efficient. Of course, this has to be seen together with what AI developers need to be conscious of when working across the stack and considering other social factors, as stated in the examples of baby-weighing and TB samples, but unbiased, centralised, anonymised records of all patients can be one such workaround in the design.
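The respondent's point about priors can be illustrated with a small sketch. The following is a minimal Beta-Bernoulli example in Python, not a model of human cognition or of any tool used in this research. The class name `BetaBelief`, the prior pseudo-counts and the evidence are all hypothetical, chosen only to show how an observer with a strong prior keeps believing in a pattern even after seeing data that contradicts it, while an observer with a weak prior follows the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Belief about how often a pattern holds, stored as Beta(alpha, beta)."""

    alpha: float  # pseudo-count of cases where the pattern held
    beta: float   # pseudo-count of cases where it did not

    def mean(self) -> float:
        """Expected probability that the pattern holds."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, observations: list[bool]) -> "BetaBelief":
        """Standard conjugate update: add observed counts to the pseudo-counts."""
        holds = sum(observations)
        fails = len(observations) - holds
        return BetaBelief(self.alpha + holds, self.beta + fails)


# Hypothetical priors (assumptions for illustration only):
# a strongly held belief that the pattern holds 90% of the time,
# and an open-minded belief with almost no prior weight.
strong_prior = BetaBelief(alpha=90.0, beta=10.0)
weak_prior = BetaBelief(alpha=1.0, beta=1.0)

# Hypothetical evidence: the pattern holds in only 5 of 20 new observations.
evidence = [True] * 5 + [False] * 15

strong_posterior = strong_prior.update(evidence)
weak_posterior = weak_prior.update(evidence)

print(round(strong_posterior.mean(), 3))  # 0.792: the old belief barely moves
print(round(weak_posterior.mean(), 3))    # 0.273: the belief follows the data
```

Both observers see the same 20 observations, yet the one with the heavy prior still concludes that the pattern mostly holds. This is one simplified reading of "we continue to think the world continues to work that way."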


