P3. Collection of user information

The company should clearly disclose what user information it collects and how.

Elements
  1. Does the company clearly disclose what types of user information it collects?
  2. For each type of user information the company collects, does the company clearly disclose how it collects that user information?
  3. Does the company clearly disclose that it limits collection of user information to what is directly relevant and necessary to accomplish the purpose of its service?
  4. (For mobile ecosystems): Does the company clearly disclose that it evaluates whether the privacy policies of third-party apps made available through its app store disclose what user information the apps collect?
  5. (For mobile ecosystems): Does the company clearly disclose that it evaluates whether third-party apps made available through its app store limit collection of user information to what is directly relevant and necessary to accomplish the purpose of the app?

Research guidance

Companies collect a wide range of personal information from users, from personal details and account profiles to a user’s activities and location. We expect companies to clearly disclose what user information (as RDR defines it, below) they collect and how they collect it. We also expect companies to commit to the principle of data minimization and to demonstrate how this principle shapes their practices regarding user information. If a company collects multiple types of information, we expect it to provide detail on how it handles each type. For mobile ecosystems, we expect the company to clearly disclose that it evaluates whether the privacy policies of apps available in its app store specify what user information those apps collect, and whether the apps adhere to data minimization principles.

RDR takes an expansive interpretation of user information, which according to our definition constitutes: “any data that is connected to an identifiable person, or may be connected to such a person by combining datasets or utilizing data-mining techniques.”

As further explanation, user information is any data that documents a user’s characteristics and/or activities. This information may or may not be tied to a specific user account. It includes, but is not limited to, personal correspondence, user-generated content, account preferences and settings, log and access data, data about a user’s activities or preferences collected from third parties either through behavioral tracking or purchasing of data, and all forms of metadata. User information is never considered anonymous except when it is used solely as a basis to generate global measures (e.g., the number of monthly active users). For example, the statement “Our service has 1 million monthly active users” contains anonymous data, since it does not give enough information to know who those 1 million users are.

Anonymous data is “data that is in no way connected to another piece of information that could enable a user to be identified.”

This expansive view is necessary for several reasons. First, skilled analysts can de-anonymize large datasets, which makes nearly all promises of anonymization unattainable. In essence, any data tied to an “anonymous identifier” is not anonymous; rather, it is often pseudonymous data that may be tied back to the user’s offline identity. Second, metadata may be as revealing of a user’s associations and interests as content data, or more so, and is therefore of vital interest. Third, entities that have access to many sources of data, such as data brokers and governments, may be able to pair two or more data sources to reveal information about users. Thus, sophisticated actors can use data that seems anonymous to construct a larger picture of a user.
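To make the pairing of data sources concrete, the following is a minimal sketch of a linkage on shared attributes. It is an illustration only, not part of RDR's methodology: the records, field names (`ad_id`, `zip`, `birth_year`, `gender`), and the `link_records` helper are all hypothetical and invented for this example.

```python
from collections import defaultdict

# Hypothetical "anonymized" dataset: no names, only a pseudonymous ID,
# a few quasi-identifiers, and an activity record. All values are invented.
pseudonymous_records = [
    {"ad_id": "a91f", "zip": "02139", "birth_year": 1984, "gender": "F", "visited": "clinic-finder"},
    {"ad_id": "c07b", "zip": "02139", "birth_year": 1990, "gender": "M", "visited": "job-board"},
    {"ad_id": "e55d", "zip": "94110", "birth_year": 1984, "gender": "F", "visited": "news-site"},
]

# Hypothetical second dataset from another source that does carry names.
public_records = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1984, "gender": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1990, "gender": "M"},
    {"name": "Carol Example", "zip": "94110", "birth_year": 1971, "gender": "F"},
]

# Attributes present in both datasets (an assumption of this sketch).
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def link_records(pseudonymous, public, keys=QUASI_IDENTIFIERS):
    """Map each pseudonymous ID to a name when the shared attributes
    match exactly one person in the second dataset."""
    index = defaultdict(list)
    for person in public:
        index[tuple(person[k] for k in keys)].append(person["name"])

    linked = {}
    for record in pseudonymous:
        candidates = index.get(tuple(record[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            linked[record["ad_id"]] = candidates[0]
    return linked


reidentified = link_records(pseudonymous_records, public_records)
print(reidentified)
```

In this toy data, two of the three pseudonymous identifiers resolve to a single named person even though neither dataset alone ties a name to the activity record.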

In some cases, laws or regulations may require companies to collect certain information, or may prohibit or discourage companies from disclosing what user information they collect. Researchers will document situations where this is the case, but a company will still lose points if it fails to meet all elements. In such situations the law causes companies to be uncompetitive, and we encourage companies to advocate for laws that enable them to fully respect users’ rights to freedom of expression and privacy.

Potential sources:

  • Company privacy policy
  • Company webpage or section on data protection or data collection