Original art by Paweł Kuczyński.

Moving fast and breaking us all: Big Tech's unaccountable algorithms

By Ellery Roberts Biddle & Jie Zhang

They decide who passes and who fails in secondary school. They decide who gets arrested and who goes to prison. They decide what news you see first thing in the morning as well as what news you won’t see. And they drive the business models—and revenues—of the world’s largest and most powerful digital platforms.

In spite of all this, many of the world's most powerful algorithms are accountable to no one—not even the companies that build and deploy them. Some have made vague pledges to “be ethical,” and for all we know, there may be strong policies or rules that the companies follow behind closed doors. But the overall lack of public explanation of how these systems are built and run suggests that companies themselves lack meaningful oversight of how their own systems work. Given the enormous effects these systems have on human rights, public health, public safety, democracy, and our understanding of reality, this is nothing short of reckless.

For the 2020 RDR Index, we looked for companies’ answers to some fundamental questions about algorithms: How do you build and train them? What do they do? What standards guide these processes?

We combed companies’ public-facing documentation and, unsurprisingly, found very little. Yet companies are harvesting user data by the minute to fuel algorithmic optimization, engagement, and personalization—all of which translate into enormous profits.

...our findings suggest that much of the technology driving revenue for the world’s most powerful digital platforms is accountable to no one—not even the companies themselves.

In the absence of this information, all we have is a set of clues. Companies offer small hints at how their systems work, both in their policies and in public statements. Other information has bubbled up through investigative journalism efforts and technical research. With our findings from a new set of indicators that evaluate company disclosures on algorithmic systems and targeted advertising, we seek to contribute to these efforts, by putting what we have learned into the broader context of what is publicly known about the algorithmic systems of digital platforms.

Why set human rights standards for algorithms?

When we expanded our methodology, we drew on recommendations from nearly 100 digital rights experts around the world, along with intergovernmental entities including the Council of Europe and the UN Special Rapporteur on the right to freedom of opinion and expression.

A majority of these stakeholders have found, and we agree, that it is imperative to apply an international human rights framework to the development and deployment of algorithms. Such a framework would not only set standards for how to “do no harm” or “be ethical,” it would also help hold companies accountable to those standards by providing mechanisms for risk assessment, enforcement, redress when harm occurs, and individual empowerment for technology users.

First and foremost, we expect companies to establish a policy or set of requirements aimed at protecting users’ rights to freedom of expression, privacy, and non-discrimination in both the development and use of what we term “algorithmic systems”: systems that use algorithms, machine learning, and other technologies to automate, optimize, and/or personalize use of their platforms. We see such a policy as paving the way for companies to make their algorithmic systems transparent and accountable to the public. While two of the telecommunications companies we rank (Telefónica and Vodafone) published such commitments, none of the digital platforms in the RDR Index has done so.

From there, our approach is twofold. First, we ask how companies build and train their algorithms, practices often rooted in data collection and data inference, whereby companies use data they have to predict users’ behaviors or attributes. Then, we look at how companies use algorithms to moderate and promote content, practices that have big implications for freedom of expression and information.
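To make “data inference” concrete, here is a minimal, hypothetical sketch of the general technique: a model is trained on behavioral signals from users whose attribute is already known, then used to predict the same attribute for users who never disclosed it. The feature names, data, and model choice are all invented for illustration and do not describe any ranked company’s actual pipeline.

```python
# Illustrative sketch only: inferring a user attribute (e.g., "likely
# expecting a baby") from behavioral signals. All features, data, and
# the model choice are hypothetical.
from sklearn.linear_model import LogisticRegression

# Each row: [visits_to_baby_content, vitamin_purchases, late_night_hours]
behavior_log = [
    [12, 3, 4.0],
    [0, 0, 1.5],
    [8, 2, 3.0],
    [1, 0, 0.5],
]
# Labels the platform already holds for some users (1 = attribute present)
known_labels = [1, 0, 1, 0]

model = LogisticRegression()
model.fit(behavior_log, known_labels)

# Inference: predict the attribute for a user who never disclosed it.
new_user = [[10, 1, 3.5]]
probability = model.predict_proba(new_user)[0][1]
print(f"Inferred probability of attribute: {probability:.2f}")
```

The point of the sketch is simply that an inferred attribute never has to be disclosed by the user at all, which is why disclosure about what data feeds these models matters so much.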

Does the company disclose a commitment to human rights in its development and use of algorithmic systems?

How do they feed their algorithms?

We know that companies collect and classify people’s personal and behavioral data at a massive scale in order to fuel their algorithmic systems. Some companies indicate as much in their privacy policies.

We all know from experience that this data is used to determine what we see online, in everything from search results, to news feed items, to ads that target us to a creepily precise degree. This can range from the mundane (ads for a store where you recently made a purchase) to the absurd (ads for baby products before you’ve told anyone that you’re pregnant) to the dangerous (ads that tell you the wrong polling date for a major election).

Yet only three digital platforms in our set—Alibaba, Apple, and Baidu—informed users that their information is used to develop or train algorithmic systems. Apple went a bit further, making a vague promise to provide users with a “means to consent and control such data use,” but we found no direct evidence of when or how this actually happens.

Original art by Paweł Kuczyński

How do the algorithms work?

No digital platform in the RDR Index offered anything close to an adequate explanation of how its algorithms work. A few published simplistic summaries of how specific systems work, typically in response to public pressure.

Three search engines (Google Search, Microsoft’s Bing, and Yandex Search) published high-level explanations of how their search algorithms work, describing major ranking parameters such as the relevance of web pages, the quality of content, context, and user settings.
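Those disclosures describe signals, not systems. As a rough illustration of why that falls short, consider a hypothetical ranking function that combines the kinds of parameters the companies name. The signals, weights, and scoring below are invented; no search engine’s actual formula is public.

```python
# Hypothetical sketch of signal-based search ranking, loosely mirroring the
# parameters the companies name (relevance, quality, context/settings).
# Signals and weights are invented for illustration.
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    relevance: float      # query-document match, 0..1
    quality: float        # estimated content quality, 0..1
    context_fit: float    # language, location, user settings, 0..1

WEIGHTS = {"relevance": 0.6, "quality": 0.3, "context_fit": 0.1}

def score(page: Page) -> float:
    return (WEIGHTS["relevance"] * page.relevance
            + WEIGHTS["quality"] * page.quality
            + WEIGHTS["context_fit"] * page.context_fit)

results = [
    Page("https://example.org/a", relevance=0.9, quality=0.4, context_fit=0.7),
    Page("https://example.org/b", relevance=0.7, quality=0.9, context_fit=0.5),
]
for page in sorted(results, key=score, reverse=True):
    print(page.url, round(score(page), 3))
```

Even a complete list of signals says nothing about how they are weighted, how those weights are learned or tested, or how experiments change them, which is exactly what the following example shows.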

These descriptions hardly account for the evidence of bias or interference of various kinds that researchers have exposed over the years.[1] In April 2020, Russian internet users suddenly began seeing “overwhelmingly negative” results when seeking information about opposition politician Alexei Navalny on Yandex’s search engine. In a subsequent public statement, the company explained that the change was due to an experimental feature that engineers had tested and later removed from the platform. But this explanation came only in response to public pressure. Countless other changes or instances of bias might occur, but they typically come to light only when they connect with major issues or public figures, as in the case of Navalny.

Nine of the 14 digital platforms we evaluated provided some information about whether they used algorithmic systems to curate, recommend, or rank content displayed to users. All were vague about how the algorithms were deployed for these purposes.

Although Google has made many references to the algorithms that curate and recommend content on YouTube, it failed to publish an operational-level policy on its algorithmic content curation systems, in contrast to some of its U.S. peers. This is troubling, given what independent researchers have unearthed about YouTube’s propensity for recommending “extreme” content in an effort to increase users’ time on the site. A search for political information can lead surprisingly quickly to violent extremist content. And people searching for child pornography, which is illegal and banned on the site, have nevertheless found their way to videos of children at the beach.

Data from Indicator F12 in the 2020 RDR Index

Facebook and Twitter each published pages with generalized descriptions of how their most popular features (News Feed and Timeline, respectively) decide what to display, naming factors that influence ranking such as recency, the presence of photos or video, and user interactions. But this information was fragmented at best and not easy to locate.

This is especially concerning in the case of Facebook, which has myriad features across its services that rely on algorithms, such as the recommendation algorithm, the controversial “people you may know” feature, and various algorithms used to filter or censor material that violates the company’s rules.

Among the companies we rank, Facebook has been the source of some of the more disturbing real-life harms that algorithmic systems can trigger. But the company offers the public little actionable information about how these algorithmic systems are built, how they operate, or how the company monitors them. Two recent examples illustrate the real-life consequences of allowing these systems to operate without meaningful oversight or accountability.

Facebook’s recommendation algorithm, which offers users suggestions on groups they might want to join, has attracted concern due to its propensity for driving users to groups with extremist ideologies. Soon after the January 2021 attack on the U.S. Capitol, evidence emerged that some of the attackers had connected with each other on Facebook groups of this nature.

Months before, in October 2020, Facebook CEO Mark Zuckerberg told the U.S. Congress that the company had stopped recommending all “political content or social issue groups,” a clear sign that the company recognized a problem with the algorithm.[2] But recent research by The Markup, an investigative journalism nonprofit, showed that the platform in fact has continued making these kinds of recommendations. When reporters contacted the company about its findings, a Facebook staffer replied that staff were “investigating why these [extremist groups] were recommended in the first place.”

Evidence like this casts a long shadow over the promises that companies make in their policies, especially when the enforcement of those policies is left up to algorithmic systems.

But even when a company does seem to have control over its technology, there is plenty of margin for error that can lead to real-life harm. This is especially evident with algorithmic systems that proactively identify and censor content that violates content rules, a practice that we know results in untold quantities of posts and accounts being restricted in error.

Demonstrators in Lagos, Nigeria, in October 2020. Photo by Kaizenify via Wikimedia Commons (CC BY-SA 4.0)

Another example of these problems in action comes from October 2020, when Nigerians took to the streets calling for an end to the country’s Federal Special Anti-Robbery Squad (known locally as SARS), a law enforcement agency notorious for corruption and abuse. When protesters were met with a violent response from the Nigerian army, social media lit up with graphic photos and videos of demonstrators being shot and killed, typically accompanied by the hashtag #EndSARS.

Instagram (which is owned by Facebook) began flagging #EndSARS posts as “false” and reducing their distribution, due to what a Facebook employee later explained was a system error that confused the hashtag with false information related to the coronavirus, also known as SARS-CoV-2.

When University of Pretoria scholar Tomiwa Ilori inquired about the error, a Facebook employee explained: “In this situation, there was a post with a doctored image about the SARS virus that was debunked...then our systems began fanning out to auto-match to other images….This is why the system error accidentally matched some of the #EndSARS posts as misinformation.”
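The explanation describes a fan-out, auto-matching process, which is consistent with the hash-based matching commonly used to find copies of a known image at scale. The toy sketch below, with an invented perceptual hash and made-up “images,” shows how a loose match threshold can sweep unrelated pictures into the net; it illustrates the general technique, not Facebook’s actual system.

```python
# Toy sketch of hash-based "fan-out" matching: debunk one image, then
# auto-match other images against it. The tiny "images" and the average
# hash below are stand-ins; the real matching system is not public.

def average_hash(pixels):
    """Toy perceptual hash: 1 bit per pixel, set if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(h1, h2):
    return sum(a != b for a, b in zip(h1, h2))

# The debunked, doctored image (4x4 grayscale values).
debunked = [[200, 200, 10, 10], [200, 200, 10, 10],
            [10, 10, 10, 10], [10, 10, 10, 10]]
# A new post's photo that merely looks somewhat similar.
protest_photo = [[190, 180, 30, 20], [180, 170, 40, 30],
                 [20, 30, 20, 10], [10, 20, 10, 10]]

THRESHOLD = 3  # a loose threshold widens the net and the false-positive risk
reference_hash = average_hash(debunked)

if hamming(average_hash(protest_photo), reference_hash) <= THRESHOLD:
    print("Flagged as matching debunked content")  # a false positive
```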

The system error resulted in the flagging and potential censorship of untold quantities of posts containing not only protest messages, but also photo and video evidence of human rights violations carried out by the Nigerian army.

Reflecting on the gravity of Instagram’s error in a piece for Slate, Ilori wrote: “Social media companies cannot continue to wish away their responsibility to protect human rights.”

Do companies measure the risks that their systems pose for human rights?

Alongside notorious cases from India, Myanmar, Sri Lanka, and beyond, the examples above demonstrate the urgent need for companies to assess the human rights risks that their platforms can pose for users and the public at large.

When we looked for evidence of companies’ efforts to mitigate harm by assessing their algorithmic systems, the results were disappointing once again. Most companies in the RDR Index did not take such steps, and those that did still leave much to be desired.

Data from Indicator G4d in the 2020 RDR Index

Apple, Facebook, Verizon Media (owner of Yahoo Mail), and Microsoft conducted human rights impact assessments to see if their algorithmic systems might cause discrimination or violate people’s privacy. But most of these companies offered little additional information, failing to describe what these processes actually entail and what groups or contexts they consider when conducting such assessments. For example, in its Annual Human Rights Report for 2019, Microsoft stated that it “began a major forward looking Human Rights Impact Assessment (HRIA) at the start of FY17 into Microsoft’s growing portfolio and expertise in artificial intelligence (AI),” but offered no further detail on the impact assessment.

“Social media companies cannot continue to wish away their responsibility to protect human rights.”

- Tomiwa Ilori, University of Pretoria

Facebook was the only company in the RDR Index that conducted a human rights impact assessment on its ad-targeting practices, but its assessment was limited in scope. The assessment came after the company’s algorithmically driven ad-targeting systems were called to task as part of an external Civil Rights Audit that Facebook commissioned after groups in the U.S. brought a civil suit against it for enabling routine violations of the Fair Housing Act. This law prohibits real estate entities from discriminating against prospective renters or buyers on the basis of their race, ethnicity, or other identity traits. The lawsuit came after investigative reporting and technical testing by ProPublica showed that Facebook’s systems effectively enabled such practices by allowing advertisers to choose which “ethnic affinity groups” they wanted to target and which ones they wanted to exclude.
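Mechanically, what the ProPublica reporting described is an audience-selection filter: advertisers pick attributes to target and attributes to exclude. The minimal sketch below, with invented user records and attribute names, shows how exclusion by an inferred trait reduces to a one-line filter; it illustrates the pattern, not Facebook’s actual ad system.

```python
# Hypothetical sketch of audience selection with include/exclude filters.
# User records and attribute names are invented for illustration.

users = [
    {"id": 1, "interests": {"real_estate"}, "inferred_affinity": "group_a"},
    {"id": 2, "interests": {"real_estate"}, "inferred_affinity": "group_b"},
    {"id": 3, "interests": {"cooking"},     "inferred_affinity": "group_a"},
]

def build_audience(users, include_interest, exclude_affinities):
    return [
        u for u in users
        if include_interest in u["interests"]
        and u["inferred_affinity"] not in exclude_affinities
    ]

# A housing ad aimed at people interested in real estate, excluding one
# inferred "affinity" group: the pattern that becomes unlawful housing
# discrimination when the group maps onto race or ethnicity.
audience = build_audience(users, "real_estate", exclude_affinities={"group_b"})
print([u["id"] for u in audience])  # user 2 never sees the ad
```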

The audit touched on the discriminatory impacts of Facebook’s algorithmic recommendation and ad-targeting systems, but emphasized that the company’s efforts to mitigate these impacts were “nascent” and that the auditors were “not given full access to the full details of these programs.”[3] The audit also was limited in scope, covering only the U.S. market, and the company appears to only have made adjustments to align its systems with U.S. anti-discrimination laws. But Facebook is a global company. The root causes of the problems that the audit assessed are by no means limited to the U.S.

Ethics commitments are not going to solve these problems

While they have become a popular answer to questions about harm caused by AI (especially in Silicon Valley), ethics commitments have proven toothless, with little evidence that they actually help protect users’ rights.

In contrast to international human rights doctrine, which offers a widely ratified and robust legal framework to guide the development and use of these technologies, ethics initiatives are neither legally binding nor enforceable. They also tend to be normative, driven by the context in which they are created. As ARTICLE 19’s Vidushi Marda put it, they have “become a smokescreen for ‘doing the right thing,’ even when there is no clear understanding of what ‘the right thing’ is, or how to measure it.”

Data for Indicator G1, Element 3 of the 2020 RDR Index

Our results illustrate Marda’s point. Of the digital platforms we ranked, six are members of the Partnership on AI and four published some type of AI principles or ethics commitment. None of the platforms published policies demonstrating an effort to integrate respect for human rights into their deployment of algorithms for their products and services, where it actually counts for users.

“Like walking through a dark forest”: Testing, inferring, and imagining

While the movement to force algorithms used by public agencies into the light has made great strides in recent years,[4] advocates have made little progress when it comes to digital platforms like the ones we rank.

In the absence of regulation that would force companies to make their systems transparent and accountable to the public, there is a growing class of researchers, journalists, and artists who are taking matters into their own hands and finding ways to show how corporate algorithms work—or at least to imagine how they might work. For example, New York University’s Ad Observatory project collects technical data that sheds light on how ads are targeted on Facebook, and offers a publicly accessible archive of selected ads, with a focus on political issues.

Citing a terms of service violation, Facebook sent a cease-and-desist letter to the leaders of the project in October 2020, shortly before the U.S. general election. Speaking about the letter’s effects, Damon McCoy, a computer scientist co-leading the project, said:

“Facebook’s algorithm...has enabled certain Facebook advertisers to profile citizens and send them misinformation about candidates and policies that are designed to influence or even suppress their vote. Shutting down a key data source for studying election interference and manipulation—in November, of all months—impedes our efforts to safeguard the democratic process.”

Others are using technical and social science research to interpret what they know and to articulate a vision of what might lie behind the screen. One famous example of such efforts is tech scholars Kate Crawford and Vladan Joler’s “Anatomy of an AI System,” a visualization of an artificial intelligence system which now sits in the Museum of Modern Art in New York City. In an earlier work, Joler used real data and technical sources to build a detailed visualization that interprets how Facebook’s algorithms likely function.

A subsection of Joler’s “Facebook Algorithmic Factory” (CC BY-NC-SA)

In a recent conversation with us, Joler explained that his aim was to create a map, something that could give us a picture of what goes on inside Facebook’s systems. “It is like walking through a dark forest with a torch,” he said of the visualization. “This is based on real data. But we don’t know how much we don’t know.”

“Even if they gave us all the mathematical functions that they use, how does this function influence society? That’s really hard to understand,” Joler told us.

This is what we have right now. Absent regulation, it will be up to the companies to show us their work. Until that happens, we must keep the torch lit.

Footnotes

[1] Harvard professor Latanya Sweeney’s seminal study of Google search results for racially associated names showed that searches for Black-identifying names returned significantly more ads related to criminal activity, such as background-check services, than searches for white-identifying names, regardless of whether the named individuals had criminal records.

Sweeney, Latanya. “Discrimination in Online Ad Delivery,” Harvard University, January 2013.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2208240

An investigation by the Wall Street Journal showed some of the effects that Google’s algorithms have on search results, including favoring big businesses over smaller ones.

Grind, Kristen, Sam Schechner, Robert McMillan, and John West. “How Google Interferes With Its Search Algorithms and Changes Your Results,” Wall Street Journal, November 15, 2019. https://www.wsj.com/articles/how-google-interferes-with-its-search-algorithms-and-changes-your-results-11573823753

[2] A 2016 study carried out by Facebook researchers, focusing on users in Germany, found that 64 percent of the time, when people joined an extremist Facebook Group, they did so because the platform had recommended it.

[3] See p. 81 of “Facebook’s Civil Rights Audit - Final Report,” July 8, 2020, https://about.fb.com/wp-content/uploads/2020/07/Civil-Rights-Audit-Final-Report.pdf

[4] Lawyers at the U.K. non-profit firm Foxglove have used strategic litigation to compel public agencies to release critical information about the algorithmic systems they use for things like deciding students’ grades and assessing visa applications.

“Home Office Drops 'Racist' Algorithm from Visa Decisions,” BBC News, August 4, 2020, https://www.bbc.com/news/technology-53650758
