Introducing RDR’s Preliminary Standards for Generative AI

By Zak Rogoff (Lead Author), Sophia Crabbe-Field, and Anna Lee Nabors


Update:
RDR is seeking feedback on draft indicators based on the preliminary standards.

Since the fall of 2022, we’ve witnessed the unprecedented development of generative AI: systems that can turn a simple text prompt into high-quality prose, images, synthetic voices, code, and more. Policymakers across the globe have been scrambling to address the emerging risks created by generative AI, with widespread discussions ongoing about how to regulate it. While tech executives have warned of the “existential” threat AI poses to humanity’s future, human rights activists have largely viewed this framing as a distraction, pointing instead to the imminent threat that this technology will exacerbate current harms, including discrimination and deepening inequality.

U.S. Senate leadership has recently unveiled an AI “SAFE Innovation Framework,” which has yet to be translated into proposed legislation. The White House and members of Congress have also demonstrated preliminary interest in regulating these new technologies, including through congressional hearings on the subject (hearings that have raised some concern about the apparent coziness of some lawmakers with OpenAI CEO Sam Altman).

Meanwhile, the EU’s draft AI Act has been the most visible and comprehensive legislative attempt to rapidly address the emergence of this new technology. Though the AI Act has been in the works since 2021, the bill was recently amended to respond to the release of generative AI. Yet experts have noted that the law, which is not expected to come into force for at least two years, has many limitations stemming from its risk-based approach. The most stringent rules proposed so far have instead come from China’s draft regulations, released in April 2023, which ban systems from outputting inaccurate information and provide strong data and copyright protections. At the same time, many have stressed the utility of existing privacy legislation, including Europe’s General Data Protection Regulation (GDPR), with regard to AI.

Generative AI presents many potential new risks, like empowering influence operations, generating non-consensual pornography, and unleashing a potentially ungovernable torrent of ultra-targeted ads. Yet it may also worsen many existing harms, from its potential for spreading biased and incorrect information at even greater scale to compounding consent issues, in this case around the use of personal information for training data. It is clear that generative AI presents a new scale of risk from the ICT sector for which novel standards are necessary if RDR is to continue holding both tech and telco giants accountable for human rights.

Download RDR’s preliminary standards here.

Overview of RDR’s New Generative AI Standards

Why we’re focusing on consumer generative AI services, not foundation models

Consumer generative AI services, from ChatGPT to Microsoft’s Bing to Midjourney’s image generator, all rely on foundation models: massive and costly AI systems, trained on data sets of huge and novel scale, that services can query on behalf of users. OpenAI, for example, provides both a service and a foundation model. ChatGPT is a consumer-facing AI service based on the company’s GPT-3.5 foundation model (for the free plan), while Microsoft’s Bing relies on OpenAI’s GPT-4 foundation model. Companies that run services powered by another company’s foundation model are often called “deployers.”

Each service chooses which foundation model to rely on, and may make significant extensions to the behavior of the model. For example, to create its ChatGPT consumer generative AI service, OpenAI extended its own GPT-3.5 foundation model by performing additional specialized training, called fine-tuning, that allows it to converse more naturally with users. Bing extended OpenAI’s GPT-4 model by adding real-time access to the web through a search engine. A debate is currently ongoing about the appropriate distribution of risks and responsibilities between the companies that provide foundation models and those which provide services powered by them, including with respect to the EU AI Act. 
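To make the deployer relationship concrete, here is a minimal, hypothetical sketch of a consumer service querying a third-party foundation model on a user’s behalf while layering its own instructions on top. The endpoint, field names, and helper function are illustrative assumptions for this sketch, not any real provider’s API.

```python
import requests

# Hypothetical endpoint for a third-party foundation model API; real providers'
# APIs differ in naming, parameters, and structure.
FOUNDATION_MODEL_API = "https://api.example-foundation-model.com/v1/generate"


def answer_user_prompt(user_prompt: str, api_key: str) -> str:
    """A deployer's service layer: it wraps the user's prompt with its own
    instructions (one simple form of 'extension'), then queries the third-party
    foundation model on the user's behalf."""
    service_instructions = (
        "You are a customer-support assistant. Tell users you are an AI system "
        "and decline requests that violate this service's content policy."
    )
    response = requests.post(
        FOUNDATION_MODEL_API,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"instructions": service_instructions, "prompt": user_prompt},
        timeout=30,
    )
    response.raise_for_status()
    # Assumes the hypothetical API returns its generated text under "output".
    return response.json()["output"]
```

In practice, services extend models in more substantial ways as well, such as fine-tuning on additional data or adding real-time web access, as described above.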

So far, prominent research projects have focused on foundation models. For example, researchers at Stanford University analyzed foundation models’ compliance with the draft AI Act. However, RDR’s mission focuses on the direct experience of users, who generally do not interact directly with foundation models. We believe that these are the first civil society-created standards, grounded in human rights, that consider the distribution of responsibility between consumer generative AI service providers and third-party providers of foundation models. They center the interaction between the user and the service, while acknowledging that the service provider must perform due diligence on the foundation model it chooses to use.

The preliminary standards cover five categories:

1) Model

2) Accountability

3) Policy enforcement

4) User privacy

5) Security

The Model category holds the majority of the newly created generative AI-specific standards. These standards expect companies to alert users that they are interacting with a machine and to explain the extent to which the information or depictions the service provides are likely to reflect reality. They require clear disclosures about the various tools used to develop the service, including the foundation model employed and the methods used to extend it. Extensions could include additional training, in which case service providers are expected to explain how they obtained consent from the people whose data was used. The standards further require the service provider to explain the business model the generative AI is designed to serve, for example, whether it seeks to persuade people to make purchases or direct them to certain sites.

To push service providers to fulfill their commitments, one standard calls for companies to carry out internal assessments and qualified third-party audits, to test the service’s behavior on criteria including bias, security, and the effectiveness of content policy enforcement, whether automated or manual. Automated policy enforcement includes technical safeguards that prevent a model from outputting hateful or illegal content. The service provider is also responsible for providing explanations of how the system works, both at an introductory level for users and at a more technical level for experts.

Standards in the Policy Enforcement category call for companies to set clear and transparent rules regarding the content and behavior they prohibit, and to explain how those rules are enforced, such as through mechanisms preventing a system from generating hate speech. Companies are also expected to offer an appeals process when they restrict a user’s account. Further, the standards call for a new practice of producing numerical reports of policy enforcement actions in generative AI services, analogous to the content moderation reports published by social media platforms.
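As a purely illustrative sketch, the kind of numerical policy enforcement report the standards envision might aggregate figures like the following; the field names and numbers are invented for this example and are not prescribed by the standards.

```python
# Hypothetical aggregate figures a quarterly policy enforcement report for a
# consumer generative AI service could disclose. All names and values are
# illustrative only.
policy_enforcement_report = {
    "reporting_period": "2023-Q2",
    "prompts_refused_automatically": 120_000,    # e.g., blocked by hate speech or illegal-content filters
    "outputs_blocked_after_generation": 45_000,  # caught by post-generation safety checks
    "user_flags_received": 8_200,
    "user_flags_upheld": 5_600,
    "accounts_restricted": 310,
    "appeals_received": 95,
    "appeals_granted": 40,
}
```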

The Accountability, User Privacy, and Security categories borrow heavily from the existing set of indicators RDR uses in our annual scorecards on big tech and telecommunications companies, with special adjustments for generative AI. They expect companies to proactively manage risks to privacy, freedom of expression and information, and freedom from discrimination through robust governance structures. Companies must also take digital security seriously and commit not to punish security researchers who identify issues with their products. The standards further expect companies to clearly explain how they use user information and to commit to minimizing it. Lastly, they call for companies to provide mitigation or remedy for human rights harms caused to their users, including from data breaches.

Across these standards, specialized clauses identify expectations of the service provider with regard to the foundation model it chooses to use. These include providing the user with summaries of or links to the foundation model provider’s transparency information, performing due diligence on the foundation model, and working on behalf of users if they need to interact with the foundation model provider to have their rights fulfilled. For example, the standards expect the service provider to help its users get their data removed from the training set used by the foundation model it relies on (though a time delay may be justified). For flagship services provided by the same company that builds the foundation model, these same obligations apply but may be met through different types of disclosures and policies.

Check out the full list of standards:

Download our PDF

 

Q & A

Why has RDR decided to create new standards for generative AI?

Consistent with RDR’s long history of holding tech and telecommunications companies accountable to users through robust human rights and transparency standards that respond to technological change and emerging technology, RDR is proposing a preliminary list of public-facing disclosures and policies that we think all companies deploying consumer generative AI must adhere to if we are to protect users and the general public from AI’s many known and potential harms. We believe these standards will spur further important discussions about how to govern generative AI, helping to move us toward voluntary industry adoption of best practices, as well as stronger legislation.

What additional and exacerbated harms does RDR hope to address in its new standards?

These standards can be applied in many ways to address the potential harms of generative AI, including the following examples. By enabling users to directly interact with AI-generated content that has the appearance of credibility, these systems risk enabling “turbocharged information manipulation,” for example by facilitating fraud and scams at mass scale when used by bad actors. Our standards call for these companies to take responsibility for ensuring their users are informed about this possibility. We also ask companies to address the risk of users employing this technology to intentionally spread harmful disinformation.

The Historical Figures app, a service powered by OpenAI’s GPT-3 foundation model, allows users to communicate with reimagined historical figures ranging from Steve Jobs to Charles Manson to Hitler. The result is alarming for myriad reasons, including the revisionist history these figures repeatedly spew and the chatbots’ willingness to spread hateful ideology. Our standards would address these harms by calling for companies to preemptively evaluate such risks through comprehensive human rights impact assessments and by establishing mechanisms through which users can flag problematic content when they encounter it. Companies should also clearly disclose what kinds of prompts from users are and aren’t allowed and what measures the company will take (including, for example, reporting or restricting a user’s account) to enforce these policies.

Generative AI also increases the incentive to collect as much data as possible, through the scraping required to train these systems, posing a significant additional threat to people’s privacy online. Many jurisdictions, including the U.S., currently lack comprehensive privacy protections at the national level, including data minimization requirements. Our standards ask companies to provide users with the opportunity to remove any data about them that has been used to train a model without their express consent.

What kind of companies do these preliminary standards apply to?

They apply to any consumer-facing generative AI product which generates static images and/or text. This capability may be the main feature of the service, or may be embedded in a chat box or similar. In the future, RDR may also consider expanding the standards to apply to generative AI that creates audio, video, 3D models, or other outputs. Consumer-facing means the system is available for direct access by the general public, including those who do not have extensive technical training, for example through a chat-style interface. This is distinct from systems that are only accessible by other software systems through an API (application programming interface) and experimental systems whose use is confined to corporate or academic researchers.

What is novel about RDR’s preliminary standards?

As far as we know, this is the first civil society publication detailing the policies and other information consumer generative AI services should publicly disclose. It is also focused on consumer generative AI services rather than foundation models. While the AI governance space is moving too quickly for us to say with confidence that any particular governance idea has not already been published in an academic paper, draft law, or blog post, the following are elements of our standards that we have never seen called for explicitly with regard to generative AI:

  • Numerical disclosures of policy enforcement actions taken by the service, such as refusing to respond to a prompt;
  • A disclosure of how the provider makes money from the service, and whether the model is designed to steer conversations toward particular outcomes;
  • An obligation for all consumer generative AI service providers to demonstrate that they have performed due diligence on the foundation model they choose to use, and to provide users with information about that foundation model;
  • Placing primary responsibility for users’ experience with the service provider rather than the foundation model provider, while recognizing that the service may need to seek support from the foundation model provider in fulfilling its users’ rights.

You ask that companies allow individuals to remove the influence of their personal information. But isn’t machine unlearning infeasible with current technology?

Many types of AI, including generative AI, are trained on datasets that include personal information. This means that people’s personal information is being used to fuel the performance of the AI, often without them having given explicit consent.

The GDPR provides people in the EU with the right, subject to some exceptions, to remove the influence of data about them from an AI system, making it “unlearn” their personal information. RDR supports this right and believes that comprehensive data protection laws in all jurisdictions should provide it. It is quite clear to us that, from a privacy perspective, this right should apply to personal information used by a consumer generative AI service to extend a foundation model.

However, there is some debate as to whether the erasure should be enforced against foundation models in particular. The only current way to remove the influence of a particular data point from foundation models is to retrain them, which costs tens of thousands to millions of dollars, requiring many hours of processing in state-of-the-art data centers. This is in contrast to service providers; they generally use much smaller datasets when they perform extra training to extend and specialize a foundation model, and removing the influence of a particular piece of information takes less time and money. 

However, the infeasibility of on-demand retraining should not leave foundation model providers entirely free of an obligation to have their systems unlearn personal information. There are a variety of problems created specifically by the use of personal information in training foundation models. For example, these models sometimes memorize and regurgitate personal information almost exactly, or replicate an image or piece of text. This could make it easier to find sensitive information that, while technically public on the web, had been much harder to locate before generative AI.

Our preliminary standards take an intermediate approach with the goal of spurring further development of unlearning capabilities. Already, it is technically possible for foundation model providers to maintain a public channel through which people can identify information about themselves that they wish to have removed from the foundation model’s training data. Since foundation models are retrained periodically as the technology develops, the company can pledge that the next version of the foundation model it creates will be trained on a dataset cleaned of these submitted pieces of information, and that it will eventually phase out the model version trained on the objectionable data. This is why our standards expect users to be able to exercise the right to machine unlearning from foundation models, but with a time delay.
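As one illustration of how such a delayed unlearning channel could be organized (the data structures and function names below are assumptions made for this sketch, not a prescribed implementation), a provider could queue removal requests between retraining cycles and filter the corresponding records out of the next training run:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class RemovalRequest:
    """A record of personal information someone has asked to have excluded
    from future training runs of the foundation model."""
    requester_id: str
    data_fingerprint: str  # e.g., a hash or identifier for the data in question
    received_at: datetime


@dataclass
class UnlearningQueue:
    """Collects removal requests between retraining cycles; the next model
    version is then trained on a dataset with these records filtered out."""
    pending: List[RemovalRequest] = field(default_factory=list)

    def submit(self, request: RemovalRequest) -> None:
        self.pending.append(request)

    def filter_training_data(self, training_records: List[Dict]) -> List[Dict]:
        """Drop any training record matching a pending removal request before
        the next scheduled retraining run."""
        fingerprints = {r.data_fingerprint for r in self.pending}
        return [rec for rec in training_records
                if rec.get("fingerprint") not in fingerprints]
```

Under this approach, a request is honored at the next retraining cycle rather than immediately, which is the time delay our standards allow for.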

In fact, computer scientists are developing ways to make it possible for an AI to unlearn specific data from its training set without a costly retraining of the entire model. We hope that by calling on foundation model providers to offer the right to machine unlearning, even if imperfectly, we can also exert pressure toward the rapid advancement of this research.

What did RDR choose not to address with these standards?

Labor and environmental rights: Many concerns have been raised about the environmental impacts of training large AI models, and their potential to displace workers and contribute to further concentration of wealth. While we believe these issues are important, they are outside RDR’s area of expertise.

Existential Risk: While hypothetical existential risk to the human race is not a completely far-fetched notion, we believe that it has dominated the conversation about advanced AI more than is appropriate, at the expense of more pressing, immediate problems which affect people—especially marginalized people—today.

Recommendations

Companies need to take transparency and accountability more seriously: Companies that wish to produce AI safely should commit to voluntarily adopting these standards. However, RDR recognizes this is unlikely to happen at sufficient scale and speed. The fact that recent calls to pause AI development seem to have mostly gone unheeded further underscores how unlikely companies are to take the time needed to properly assess the harms of these new technologies as they compete to be first to roll them out. The urgent need for well-considered regulatory intervention is therefore clear, and we hope these standards can also help inform such discussions.

Regulatory interventions: There is a clear need for an overarching AI legal framework across jurisdictions. In the U.S., unsuccessful bills like the Algorithmic Accountability Act and the Algorithmic Justice and Online Transparency Act have attempted to address some AI issues without overarching regulation.

The EU’s upcoming AI Act seeks to provide overarching regulation for AI by clarifying legal liabilities and penalizing discrimination, algorithmic manipulation, and other harms. Though the Act has been lauded as an important step forward in regulating AI, experts have warned against holding it up as a “gold standard,” due to the risks of some harms being improperly classified as “low risk” and therefore facing insufficient regulation, among other concerns.

Indeed, RDR’s standards may actually highlight the need for specific additions to the law. For example, our standards require numerical disclosures of content moderation-like actions for generative AI (such as the system refusing to generate responses to certain prompts), analogous to the disclosures large platforms currently make about the quantity of content removed. This is not currently included in the EU’s legislation, yet future regulation should be expanded to include these types of transparency disclosures.

Let’s not forget about basic privacy laws: For AI legislation to provide real protection for users, all countries need comprehensive data protection laws. Many of the privacy-related risks stemming from generative AI, such as the incentive for companies to rapaciously collect data for use in training, are simply exacerbations of the harms of existing internet business models. Indeed, existing privacy legislation in many jurisdictions provides some safeguards that will be applicable to AI: For example, the GDPR provides safeguards for fully automated decision-making. Meanwhile, trying to pinpoint AI-specific risks without addressing the fundamental need for data protection should be a last resort, as it could result in regulatory gaps.

Next steps

Following a consultation process with human rights and AI experts over the course of summer 2023, RDR will transform these preliminary standards into measurable indicators. This will allow us to benchmark and compare prominent consumer generative AI services in a Generative AI Accountability Scorecard, to be published in late 2023. 

We will also engage proactively with generative AI companies, and explore opportunities to incorporate ideas from our standards into upcoming legislation, corporate policy, and civil society statements.

Get involved

Following this publication of preliminary standards, RDR is committed to working in concert with fellow digital rights organizations and stakeholder groups in further developing them. We are offering three ways to participate, all of which you can do by emailing methodology@rankingdigitalrights.org:

  • Provide feedback on the preliminary standards in any format.
  • Request to join our email list, created as a space for discussion and announcements about civil society and academic projects to evaluate the policies and transparency of generative AI services.
  • Request to participate in RDR’s formal consultation process on our new generative AI indicators, which will take place later in the summer of 2023. The consultation process will begin with a call for comment from all people who have already reached out.

Acknowledgements

We thank the community of experts who provided feedback on early drafts of these preliminary standards, including Rishi Bommasani, Prem Trivedi, David Morar, Nathalie Maréchal, and Lorenzo Pacchiardi.
