Building equitable products requires balancing a deep understanding of the people using the product with responsible data practices. Collecting demographic data can be challenging, but this primer offers a starting point for ethical data collection.
It provides a framework for product teams to address common challenges and risks, and to begin standardizing practices that are often difficult to align. While not an exhaustive resource on all aspects of data collection, this primer presents crucial considerations for data practices in advancing Product Equity.
This guide encourages collaboration across different sectors, including civil society, academia, and industry, while providing resources to support further learning and the development of industry-wide standards. To join this effort, contact the Product Equity Working Group at TechAccountabilityCoalition@AspenInstitute.org. For more information or resources, visit the Product Equity Hub.
This evolving resource is primarily intended for product teams—such as researchers, designers, product managers, and strategists—and data teams, including scientists, architects, and analysts, to guide their work on Product Equity and responsible data practices. Practitioners in compliance, security, legal, and academia may also find this resource helpful.
The examples and case studies are based on extensive research and consultation with Product Equity practitioners and civil society groups.
To effectively build products for all, and especially for systemically marginalized communities, we need to deepen our relationship with the people using the products. Anyone engaged in product design and development needs to understand the diverse identities, contexts, and experiences that inform how people interact with the world—and with products. Product Equity is not only about eliminating and minimizing harm—it also seeks to introduce and enhance positive experiences, ultimately enabling people to thrive.
Doing so means intentionally creating an experience that empowers individuals to succeed, grow, and feel fulfilled in their interactions with a product, service, or platform. It goes beyond just meeting people’s basic needs or expectations, focusing instead on fostering positive, empowering experiences that help them achieve their goals, improve their well-being, and develop their potential. In the context of digital platforms, fostering individual thriving can include elements like personalized experiences, ease of use, accessibility, and support for long-term success, contributing to overall user satisfaction and loyalty.
To meaningfully engage with systemically marginalized communities and address their needs, we turn to data—because we can’t mitigate disparities without first understanding them.
Demographic data can be a powerful tool for assessing and improving equity and fairness in products, but there are multiple ways to analyze and apply this data once collected.
Here are some example approaches for using demographic data:
Companies already collect data about their customers, but that data is not always helpful for Product Equity purposes.
If companies do collect demographic data, it might be:
(1) collected in a way that’s not inclusive;
(2) segmented in a way that doesn’t tell us enough information about particular user needs and experiences; and/or
(3) collected in a way that makes the data unreliable for statistical analysis.
Responsible data practices require an eye toward harm mitigation. They also require an acknowledgment that demographic data is not like other data. For example, multinational companies may be interested in improving experiences for individuals in the LGBTQIA+ community, but they may also operate in countries where identifying as LGBTQIA+ is criminalized. How might such a company build inclusive products for this community without putting those individuals at risk of harm?
For further reading on the potential harms of demographic data collection:
McCullough, Eliza and Villeneuve, Sarah. Participatory and Inclusive Demographic Data Guidelines, p. 5
Partnership on AI. “Appendix 5: Detailed Summary of Challenges & Risks Associated with Demographic Data Collection & Analysis.” Eyes Off My Data: Exploring Differentially Private Federated Statistics To Support Algorithmic Bias Assessments Across Demographic Groups
(1) Acknowledge historical mistrust and mitigate potential harms.
(2) Prioritize user rights, including consent, ownership, and user agency (the ability and power of individuals to make choices, control their actions, and influence their experience within a product), and apply the highest standard of data protection law available in your geographic scope, or globally.
(3) Promote company transparency and accountability to the customer.
(4) Conduct regular audits and assessments to ensure data collection methods remain aligned with intended use cases and evolving product needs.
These actions represent cycles of iteration and review. A commitment to responsible data collection means not only establishing sound processes and principles but also actively seeking feedback and adapting to evolving laws, shifting user needs, and other relevant factors.
This primer gives an overview of the main concerns product practitioners and their teams should consider before and during demographic data collection. Our goal is to empower practitioners to collect data safely and effectively, making harm mitigation a foundational part of how products are developed.
This primer is not the only effort in the broader Product Equity space to address responsible data collection. Miranda Bogen’s report, Navigating Demographic Measurement for Fairness and Equity (May 2024), highlights the growing need for AI developers and policymakers to identify and mitigate bias in AI systems, emphasizing responsible data practices for measuring fairness.
There are ethical dilemmas associated with collecting and using data to make decisions that impact people’s lives. Responsible data practices are essential in addressing these challenges. For example, considering how an individual’s gender or marital status might impact their creditworthiness raises important questions about fairness and transparency, as well as the potential for discrimination based on data-driven insights and decisions.
Discriminatory practices and bias in data collection: Algorithms may inadvertently or intentionally discriminate against certain groups based on characteristics, experiences, and/or demographics due to incomplete data collection. For example, Daneshjou et al. (2022) document how dermatology AI models trained on datasets composed predominantly of lighter-skinned patients performed worse at predicting skin diseases for patients with darker skin tones.
Discriminatory practices and bias in data-based decision-making: Biased data can perpetuate and even exacerbate historical and social inequalities. For example, in 2018, Amazon scrapped an AI-based recruiting tool because it favored male candidates over female candidates. The tool was trained on resumes submitted over a 10-year period, most of which came from men, reinforcing the existing gender imbalance in tech. The AI system penalized resumes that included terms like “women’s” and downgraded graduates of women-only colleges. This also points to the need for fairness-aware machine learning and algorithmic transparency to combat bias.
Unequal access and experiences: Companies use collected data to develop products and services that may not benefit all individuals equally. Early versions of facial recognition technology were less effective at recognizing people with darker skin tones, a problem rooted in the use of biased training data that lacked sufficient representation of darker skin tones. This created unequal experiences for some people and perpetuated the digital divide by making certain features less accessible to some systemically marginalized groups.
Due to the potential harms listed above, ask yourself and your stakeholders if data collection is the only way forward. Perhaps there are other ways of understanding your target or current audience, such as data you can infer by proxy or publicly available sources of data.
For further reading on algorithmic harms and responsible data practices, consider these books:
Defining your use case ahead of time helps protect against collecting more sensitive data than is necessary. Establishing clear use case boundaries can also reduce the likelihood of later misuse. If you don’t know why you’re collecting a category of personal identity data, chances are your customers won’t know either, which can damage credibility and trust.
*Note: Carefully consider the tradeoffs around this approach. For more on the limitations of synthetic and proxy data, see:
Rocher, Luc. “Misinterpretation, privacy and data protection challenges – putting proxy data under the spotlight,” 2022
McCullough, Eliza and Villeneuve, Sarah. Participatory and Inclusive Demographic Data Guidelines, p. 9
To collect demographic data effectively, teams need to define what data is required and how it will be used. For the purposes of this primer, we define demographic data as measurable traits of any given population, such as, but not limited to, age, gender, and race.
We acknowledge there are many types of personal data, including demographic data. For more on defining personal data, see The Wired Guide to Your Personal Data (and Who Is Using It), 2019.
Before data collection, the practitioner’s goal is to understand how the data will be used in as much detail as possible. This practice may include any of the approaches outlined in the Using Demographic Data to Advance Product Equity section, including fairness testing. Fairness testing, however, is only necessary if it is a focus of your analysis, and will be explored further in the sections below.
As a best practice, we recommend drafting a pre-analysis plan before beginning data collection. This plan is a short document that details the following: (1) the structures, governance, and datasets already in place; (2) potential harms and risks, along with mitigation and retention safeguards; (3) the product being evaluated and the type of fairness or equity of primary concern; (4) the data points needed, including their granularity and intersectionality; (5) the statistical approach, if a fairness test is planned; and (6) anticipated outcomes and potential remediations.
This pre-analysis plan is designed to establish a shared understanding among all stakeholders regarding safe and effective data collection and the subsequent steps.
The following sections detail each of the six components of the pre-analysis plan, outlining key considerations, promoting responsible data practices, and identifying relevant stakeholders.
Fairness testing helps advance Product Equity by ensuring that algorithms, models, and decision-making systems operate without bias. Product Equity strategies shape fairness testing by identifying areas where marginalized groups may face disproportionate impacts. Together, they promote inclusive and fair system design. However, statistical fairness testing alone cannot address algorithmic harms. Mathematical definitions of “fairness” often overlook systemic discrimination, racial capitalism, and the ethics of risk classification. For more, see Rodrigo Ochigame’s The Long History of Algorithmic Fairness (2020).
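As a minimal illustration of what a statistical fairness check can look like (and of its limits), the sketch below compares positive-decision rates across demographic groups, a check often called demographic parity. The data, column names, and 80% screening threshold are hypothetical assumptions for illustration, not a prescribed methodology.

```python
import pandas as pd

# Hypothetical decision log: one row per person, with the system's decision and a
# self-reported demographic attribute. All values and column names are illustrative.
decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,   0,   1,   1,   0,   0,   1,   0],
})

# Demographic parity compares approval rates across groups.
rates = decisions.groupby("group")["approved"].mean()
print(rates)

# A common (and contested) screening heuristic: flag any group whose rate falls
# below 80% of the highest group's rate. This surfaces a disparity; it does not
# explain it or establish that the system is fair or unfair.
flagged = rates[rates / rates.max() < 0.8]
print("Groups flagged for review:", list(flagged.index))
```

As noted above, a metric like this can surface a disparity, but it cannot account for the systemic factors behind it.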
The first step in the process is to understand the structures, governance, and datasets already in place. Companies rarely start with a blank slate, and a practitioner should not dive into data collection without developing a sense of the landscape.
Data assessments should evaluate the impact of your company’s data collection practices on systemically marginalized groups to identify areas for improvement. Explore other data collection and retention policies that exist. Your team might determine that regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) are not doing enough for your use case.
For example, see Meta’s approach, which involves secure multiparty computation (SMPC) to encrypt and split survey responses into fragments. These encrypted shares are distributed among third-party facilitators who aggregate the data without revealing individual responses. This approach allows Meta to analyze the data for fairness across different racial and ethnic groups without violating privacy.
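To make the idea concrete, the sketch below illustrates additive secret sharing, one building block behind approaches like SMPC: each response is split into random shares so that no single facilitator sees the true value, yet the shares still combine into the correct aggregate. This is a simplified teaching example under assumed parameters, not a description of Meta's actual system, and it omits the protections a production protocol requires.

```python
import random

PRIME = 2**61 - 1  # arithmetic is done modulo a large prime

def split_into_shares(value, n_facilitators=3):
    """Split one survey response into random shares that sum to the value (mod PRIME)."""
    shares = [random.randrange(PRIME) for _ in range(n_facilitators - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

# Hypothetical binary survey responses (1 = reported experiencing an issue).
responses = [1, 0, 1, 1, 0]

# Each facilitator receives one share per respondent and sums only its own shares.
facilitator_totals = [0, 0, 0]
for r in responses:
    for i, share in enumerate(split_into_shares(r)):
        facilitator_totals[i] = (facilitator_totals[i] + share) % PRIME

# Combining the facilitators' totals reveals only the aggregate count,
# not any individual response.
aggregate = sum(facilitator_totals) % PRIME
print(aggregate)  # 3
```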
Are there better models and/or approaches your company can adopt to ensure responsible data practices?
Minimizing harm by anticipating what could go wrong when collecting data is a top priority. Harms can arise from improper use, insecure storage, or data leaks, and they can manifest in many forms.
To address these risks:
Even if your company has established data retention policies and timelines, revisit those and decide if stricter policies are needed.
Tips for data retention:
To minimize risks associated with data breaches, regularly review and purge data that is no longer needed. Data leaks or theft can expose individuals to malicious actors, identity theft, financial loss, and other serious consequences. People who hold systemically marginalized identities are more vulnerable to these risks, and leaked data may expose these communities to additional negative—potentially lifelong—consequences.
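Retention policies are easier to uphold when purging is routine rather than ad hoc. Below is a minimal sketch of a scheduled purge over a hypothetical table of survey records; the schema, file name, and one-year window are assumptions to replace with your own policy.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 365  # hypothetical policy: purge demographic survey data after one year

conn = sqlite3.connect("survey_responses.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS responses (id INTEGER PRIMARY KEY, collected_at TEXT, payload TEXT)"
)

# Delete records older than the retention window. Run this on a schedule
# (e.g., a nightly job) and log how many rows were removed for audit purposes.
cutoff = (datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)).isoformat()
deleted = conn.execute("DELETE FROM responses WHERE collected_at < ?", (cutoff,)).rowcount
conn.commit()
print(f"Purged {deleted} expired records")
```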
Equity assessments and fairness tests must be aligned with the product being evaluated. If the product is designed to address a specific user experience, the focus should be on measuring disparities within that experience. This, too, should inform the design of the assessment or fairness test.
The nature of the product also shapes the discussions that follow the evaluation. Findings from an assessment or fairness test do not exist in a vacuum; any burdens or benefits identified for one group need to be weighed against other considerations. If the goal is to modify the product based on the findings, product teams should set realistic expectations about what can and can’t be changed.
At this stage, the team should align on the specific types of fairness or equity that are the primary concern, ensuring that responsible data practices are integrated into the process.
For example, if the team is conducting an equity-focused assessment, they might consider questions like:
In contrast, a fairness test may be used to evaluate an algorithm. For instance, suppose a new algorithm is developed to decide whether an individual qualifies for a loan. The team might ask:
Gather input from a diverse range of stakeholder teams, including:
Engaging with these groups will help support a robust discussion about what fairness and equity mean in the specific context of the product being tested.
Once the outcome to be measured is selected, the pre-analysis plan should specify which data points are required. The type of demographic data needed will depend on the dimensions being assessed—for instance, is the evaluation going to measure disparities by national origin, gender, race, or some other attribute? You will want to prioritize the dimensions most relevant to your product; there is no one-size-fits-all solution.
For example:
The team should also identify any existing data relevant to the assessment or fairness test. For example, in the previous loan scenario, past default rates or income might provide valuable context. The goal is to identify all data that might help explain a disparity.
When choosing attributes, consider intersectionality to address the nuanced experiences of people who belong to multiple marginalized groups. Without capturing intersectionality, there is a risk that these individuals’ experiences are lost in the aggregation of one of the identities. For example:
The level of granularity (or disaggregation) is another facet of acknowledging the nuanced experiences of your customers. When considering user groups, examine the level of granularity or coarseness a dimension of identity requires. For example:
Since many identity categories are based on social constructs—many of which are not created by members within those identity categories—it is important to apply responsible data practices to determine when granularity is necessary.
While identity data is valuable for measuring inclusivity, using such data to directly personalize user experiences without explicit, opt-in consent can be harmful. For example:
The decision on the level of granularity is not one-size-fits-all:
Deciding on the right balance between intersectionality and disaggregation should be guided by the size of your data set. This is often an iterative process. This resource from the Partnership on AI grapples with the privacy/accuracy trade-offs at length. It can help anticipate challenges during the analysis phase and inform the establishment of feedback mechanisms and monitoring practices to validate that your chosen trade-off yields useful insights.
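One practical way to balance disaggregation against reliability and privacy is to suppress intersectional groups whose sample is too small to report. The sketch below applies a minimum cell-size threshold to hypothetical self-ID data; the threshold, categories, and column names are illustrative, and the right value depends on your data set and the privacy/accuracy trade-offs discussed in the Partnership on AI resource.

```python
import pandas as pd

MIN_CELL_SIZE = 30  # hypothetical reporting threshold; tune to your data and risk tolerance

# Hypothetical self-ID data with two dimensions of identity.
df = pd.DataFrame({
    "gender":    ["woman"] * 80 + ["man"] * 80 + ["nonbinary"] * 12,
    "race":      (["Black"] * 40 + ["Asian"] * 40) * 2 + ["White"] * 12,
    "satisfied": [1, 0] * 86,
})

# Disaggregate by the intersection of gender and race, then suppress cells
# that are too small to analyze or report responsibly.
cells = df.groupby(["gender", "race"]).agg(
    n=("satisfied", "size"),
    satisfaction_rate=("satisfied", "mean"),
)
reportable = cells[cells["n"] >= MIN_CELL_SIZE]
suppressed = cells[cells["n"] < MIN_CELL_SIZE]
print(reportable)
print(f"{len(suppressed)} intersectional groups suppressed (n < {MIN_CELL_SIZE})")
```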
This process ensures that the data collected is both relevant and compliant, setting a strong foundation for equity and fairness evaluations.
If the team is pursuing a fairness test, then once there is agreement on the type of fairness being tested, the groups of interest, and the relevant variables, the pre-analysis plan should specify the statistical approach used to assess fairness, as in the sketch below.
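Here is a minimal sketch of such a pre-specified test, continuing the hypothetical loan example. All counts, groups, and thresholds are illustrative, and the two-proportion z-test is just one of many approaches a plan might name.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical pre-specified test from a pre-analysis plan: compare loan approval
# rates for two demographic groups with a two-proportion z-test. Counts are illustrative.
approved_a, total_a = 180, 400   # group A
approved_b, total_b = 210, 400   # group B

p_a, p_b = approved_a / total_a, approved_b / total_b
p_pool = (approved_a + approved_b) / (total_a + total_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / total_a + 1 / total_b))
z = (p_a - p_b) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Approval rates: A={p_a:.2f}, B={p_b:.2f}, z={z:.2f}, p={p_value:.3f}")
# Per the plan, a pre-registered threshold (e.g., p < 0.05 together with a
# practically meaningful gap) determines whether the finding triggers review.
```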
Describing statistical tests in advance is considered the gold standard in the social sciences because it enforces discipline and mitigates hindsight bias. This upfront specification helps mitigate the risk of teams inadvertently manipulating testing by running repeated analyses to achieve more favorable results. Even in good faith, a team might add or adjust variables after finding a problematic result.
A well-defined pre-analysis plan clarifies the statistical approach from the outset, highlighting any deviations from the planned approach. Responsible data practices ensure transparency in these decisions. Departures are often warranted as new things are learned or new data comes in, but being clear that these are departures from the plan is key.
With the data and testing approach clearly outlined, the pre-analysis plan should anticipate potential outcomes and remediations.
In the plan, write out all possible findings—e.g., group A does worse than group B; group A is equal to group B; group A does better than group B; and so on. For each of these findings, describe potential remediations, like retraining an algorithm, modifying the product, delaying the product launch, and so on.
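One lightweight way to keep this mapping explicit and reviewable is to record it as structured data alongside the pre-analysis plan. The findings and remediations below are hypothetical placeholders.

```python
# Hypothetical findings-to-remediation mapping from a pre-analysis plan.
# Entries are illustrative; the right remediations depend on the product.
remediation_plan = {
    "group_A_approval_rate_significantly_lower": [
        "Retrain the algorithm with rebalanced or additional data",
        "Modify the product flow that drives the disparity",
        "Delay launch pending an equity review",
    ],
    "no_statistically_significant_disparity": [
        "Document the result and continue periodic monitoring",
    ],
    "group_A_approval_rate_significantly_higher": [
        "Investigate whether group B faces barriers elsewhere in the funnel",
    ],
}

for finding, remediations in remediation_plan.items():
    print(finding, "->", "; ".join(remediations))
```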
All teams should weigh in:
The period before data collection should be devoted to outlining a thorough pre-analysis plan. In doing so, the team can coalesce around a strategy and answer all the relevant questions necessary to embark on responsible data collection.
Once you’ve determined your use cases, established policies, and built safeguards, consider user rights around transparency, consent, and ownership during the data collection process.
When planning data collection, consider adopting a co-design approach, which emphasizes collaboration with participants and product team stakeholders to create solutions that are both relevant and effective for those impacted.
Ideally, this process is owned by user experience (UX) stakeholders, such as UX researchers and UX designers. Co-design fosters direct engagement with communities, enabling teams to build with, rather than just for, these groups.
Key elements of collaboration include:
Data collection should also encompass self-identification (self-ID), which allows people to share their demographic information and capture their unique backgrounds and experiences.
When selecting tools or vendors for data collection, confirm they align with your team’s privacy and security standards, as third-party platforms may have differing safeguards.
Additionally, determine whether data collection will be conducted in-person, online, within a product experience, or separately. Each method requires careful consideration of user rights, transparency, consent, and data ownership.
Transparency is essential in explaining the purpose for data collection and addressing common concerns such as privacy, data storage, and retention.
Consider creating a Frequently Asked Questions (FAQ) resource to answer key questions like “Who will have access to this data?” and “What data will they access?” Use clear and accessible language so that all participants, including those with disabilities or limited English proficiency, can understand. Keep communications concise and link to more detailed policies as needed.
Legal review of company policies and user communication is recommended at this stage. Clearly articulate the value exchange—how data collection will benefit participants—as this fosters trust and helps them make informed decisions.
Informed consent is a critical process often facilitated through written documents, such as Non-Disclosure Agreements (NDAs). However, participants may overlook these materials or be overwhelmed by legal jargon. To promote genuine informed consent, include a verbal explanation in moderated data collection settings, outlining the research approach, data types, access permissions, and any sensitive topics that will be discussed.
Obtain affirmative consent for each element shared, and emphasize participants’ right to withdraw at any time without any negative consequences. To enhance understanding, consider using visual aids or videos, similar to Patient Decision Aids (PDAs) in healthcare, which help individuals make informed decisions based on their values and preferences.
Anonymity is another important consideration. Clearly communicate the actual level of anonymity participants can expect and their rights regarding data retention and retrieval. Participants should be made aware of the risks associated with sharing identifying information, from severe consequences like exposure to harm due to their identity or experiences, to milder yet still significant risks, like exclusion from systems. Never share personally identifiable information (PII), and ensure participants are aware of this from the outset.
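One common safeguard is to separate direct identifiers from responses before analysis, replacing them with random participant IDs held in a restricted key table. The sketch below shows that pseudonymization step with made-up records; note that pseudonymized data is not fully anonymous, which is part of what should be communicated honestly to participants.

```python
import uuid

# Hypothetical raw records containing direct identifiers alongside responses.
raw_records = [
    {"email": "participant1@example.com", "response": "Agree"},
    {"email": "participant2@example.com", "response": "Disagree"},
]

key_table = {}          # identifier -> participant ID; store separately with restricted access
analysis_records = []   # what analysts actually see: no direct identifiers

for record in raw_records:
    participant_id = key_table.setdefault(record["email"], str(uuid.uuid4()))
    analysis_records.append({"participant_id": participant_id, "response": record["response"]})

print(analysis_records)
```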
Before launching externally, conduct internal testing with representative groups, such as employee resource groups (ERGs), to gather feedback and make necessary adjustments.
Always offer participants the option to opt-out or refuse participation in any user research or data collection process. Additionally, individuals should have the option to withdraw their data after it has been collected. This mechanism plays a crucial role in ensuring that data collection is ethical, respects privacy, and adheres to consent principles.
Deciding which groups to include in data collection should align with the predetermined research goals and intended use cases, as well as with the groups you may be co-designing with.
It is important to also address potential biases in the dataset. You may aim for proportional representation to mirror the general population or over-sample specific groups for more granular insights.
For example, oversampling is vital for disaggregating data by race and generating meaningful results for smaller populations like American Indian/Alaska Native and Asian American/Pacific Islander groups, often overlooked in broad categories. Even if you’re targeting a narrow market, it is beneficial to include diverse perspectives to gain a broader understanding or uncover contrasting insights.
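When a group is oversampled, population-level estimates should be reweighted so the oversample does not skew aggregate results, while the larger subsample still supports disaggregated analysis. Below is a minimal sketch with hypothetical population shares, sample counts, and satisfaction rates.

```python
# Hypothetical example: group X is 2% of the population but was deliberately
# oversampled to 20% of responses so its disaggregated results are meaningful.
population_share = {"group_X": 0.02, "everyone_else": 0.98}
sample_counts    = {"group_X": 200,  "everyone_else": 800}
satisfaction     = {"group_X": 0.61, "everyone_else": 0.74}  # observed rate per group

total_n = sum(sample_counts.values())

# Weight each group by (population share) / (sample share) so population-level
# estimates reflect the population, not the sample design.
weights = {g: population_share[g] / (sample_counts[g] / total_n) for g in sample_counts}

unweighted = sum(satisfaction[g] * sample_counts[g] for g in sample_counts) / total_n
weighted = (
    sum(satisfaction[g] * sample_counts[g] * weights[g] for g in sample_counts)
    / sum(sample_counts[g] * weights[g] for g in sample_counts)
)

print(f"Unweighted overall rate: {unweighted:.3f}")   # skewed toward the oversampled group
print(f"Population-weighted rate: {weighted:.3f}")
```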
We hope this guide serves as a valuable starting point for product and data teams to adopt responsible data practices in advancing Product Equity. By using this resource, teams can better navigate the complexities of ethical data collection, minimize potential harms, and build more inclusive, equitable products. Future iterations will incorporate broader insights and case studies to further refine and strengthen this resource.
We invite you to collaborate with the Product Equity Working Group to help shape industry-wide best practices. Through collective effort, we can accelerate change and work toward a future where more people have access to all that tech has to offer.
If you are interested in joining the Product Equity Working Group, please email us at TechAccountabilityCoalition@AspenInstitute.org. For more Product Equity resources or to learn about the Product Equity Working Group, visit our Resource Hub.
Across the Product Equity resources published on Aspen Digital, we use the term systemically marginalized to describe individuals and communities who have historically faced systemic injustices, continue to face them today, or are newly marginalized due to evolving structural disparities. These forces shape how they engage with or are excluded from digital products.
We believe the language we use to talk about people is immensely important, alive and evolving, context-specific and connotative, and highly political. To ensure intentionality in our language, we carefully reviewed how companies, civil society organizations, and academic institutions are talking about vulnerable communities. Based on these considerations, we have adopted the term systemically marginalized.
As an organization, we aim for inclusion in our language and recognize that grouping such a diverse set of communities together into one umbrella term may cause some generalizations. We also aim for precision and will specify wherever possible to honor the unique experiences, histories, challenges, and strengths of communities. We hope this language helps us advance the emerging field of Product Equity and welcome the iterations and evolutions this term may take in the upcoming years of this work.
We extend immense gratitude to the members of the Product Equity Working Group, both past and present, for sharing their time and expertise over the past two years to develop this foundational guide on responsible data practices. Their perspectives were crucial in ensuring we approached this work with care and consideration. A special thanks to our Chair, Dr. Madihah Akther, who drove this initiative forward, ensuring care and consideration at every step of the process.
Thank you to the subject matter experts who reviewed drafts of this primer and were immensely generous with their time and expertise:
This effort was led by the following team members at Aspen Digital:
Responsible Data Practices for Product Equity by Aspen Digital is licensed under CC-BY 4.0.