Why is AI biased in the first place?

AI models learn from historical data, and that data records human decisions that were often discriminatory, incomplete or unrepresentative. The model does not invent bias; it inherits and operationalises whatever bias was already present in the patterns it was trained on.

What is the difference between historical, representation and measurement bias?

Historical bias is when training data reflects past discrimination (e.g. biased hiring records). Representation bias is when certain groups are underrepresented in the data, so the model performs worse for them. Measurement bias is when the metric used to evaluate the model itself encodes injustice (e.g. scoring a model on how well it predicts arrests, used as a proxy for crime).

Can AI actually be made fair?

Yes, but not by accident. Fairness has to be a design requirement from the start: deliberate dataset curation, explicit fairness constraints, outcome-based evaluation metrics, community involvement and ongoing monitoring after deployment. It is achievable, but it is hard and currently underfunded relative to its importance.

Is biased AI worse than biased humans?

It can be, because it runs at scale and carries an aura of algorithmic objectivity that makes outcomes harder to question. A biased human affects one decision at a time; a biased model can shape millions of decisions a day while everyone assumes the math is neutral.

What can companies do to reduce AI bias today?

Audit training data for representation gaps, evaluate models on outcomes across demographic groups rather than just overall accuracy, involve affected communities in design and review, document data provenance transparently, and treat fairness as a product KPI rather than a compliance afterthought.
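To make "outcomes across demographic groups rather than just overall accuracy" concrete, here is a minimal sketch in plain Python. The function name `group_metrics`, the group labels and the toy data are all hypothetical and chosen only for illustration; a real audit would use a dedicated fairness library and far more data.

```python
from collections import defaultdict

def group_metrics(y_true, y_pred, groups):
    """Return accuracy and positive-prediction (selection) rate per group."""
    totals = defaultdict(lambda: {"n": 0, "correct": 0, "selected": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        stats = totals[group]
        stats["n"] += 1
        stats["correct"] += int(truth == pred)
        stats["selected"] += int(pred == 1)
    return {
        g: {
            "accuracy": s["correct"] / s["n"],
            "selection_rate": s["selected"] / s["n"],
        }
        for g, s in totals.items()
    }

# Hypothetical toy data: 8 cases from group "a", 4 from group "b".
y_true = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1]
groups = ["a"] * 8 + ["b"] * 4

# A single headline number hides the gap between the two groups.
overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = group_metrics(y_true, y_pred, groups)

print(f"overall accuracy: {overall_accuracy:.2f}")
for name, stats in per_group.items():
    print(name, stats)
```

In this toy data the overall accuracy looks respectable, yet every error falls on the smaller group, which is exactly the pattern an aggregate metric cannot show.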


DATA · ETHICS · MACHINE LEARNING

Can We Train Machines to Be Better Than Us?

If AI learns from us, can it ever rise above our worst patterns?

Sahir Maharaj · May 10, 2026 · 10 min read

Glowing scales of justice rendered in data particles balanced over a flowing river of binary code on a deep navy background — We feed it our history and expect a better future. That math has never quite worked out.

There is a question I find myself returning to whenever I am deep in a data pipeline or reviewing a model's outputs, one that feels deceptively simple but gets more complicated the longer you sit with it. The question is this: if we built an AI that learned only from the best of what human beings have ever done, the most generous decisions, the most rigorous thinking, the most ethically consistent behavior, would it become better than us? And the follow-up, which is the one that really keeps me going, is: could we actually do that? Because the gap between the AI we are currently building and the AI that question imagines is enormous, and almost all of that gap is a data problem. Not a technical data problem. A deeply human one.

The systems we are building right now learn from the world as it is, not as we wish it were. They ingest the historical record of human decision-making, which is a record shot through with bias, inconsistency, cruelty, and structural injustice that we have spent centuries trying to correct. Hiring data that reflects decades of discriminatory practices. Criminal justice records that encode systemic racial disparities. Medical literature that has historically underrepresented women and non-white populations in clinical research. Loan approval histories that reflect neighborhood redlining. Content moderation decisions that vary wildly based on language and cultural context. When an AI learns from all of that, it does not learn the aspirational version of human values. It learns the operational one, the one we have actually enacted rather than the one we have claimed to hold.

What makes this a genuinely urgent problem rather than just a philosophical concern is that AI systems are increasingly being deployed in high-stakes decision contexts precisely because they seem more objective than humans. They do not have bad days. They do not harbor obvious prejudices. They do not make decisions based on whether they like your face. Those arguments have real appeal, especially in contexts where documented human bias has caused serious harm. But they rest on an assumption that turns out to be critically wrong: that data is neutral. Data is not neutral. Data is a record of what happened, and what happened was produced by humans with all the limitations and blind spots that entails. Feeding that record into a machine does not launder it into objectivity. It preserves the bias while adding the authority of algorithmic output, which is in some ways worse.

A tall stack of old weathered archive ledgers and folders on a wooden desk in warm window light with dust motes floating in the air — The training data is just history. And history was never neutral to begin with.

To think clearly about whether we can train better AI, it helps to understand specifically how bias enters the training process, because there is more than one pathway and they require different responses. The most discussed form is historical bias, where the training data reflects past discrimination that we would now consider unjust. The classic example is a hiring model trained on a company's historical hiring decisions, which may systematically underrepresent certain groups not because those groups are less qualified but because they faced barriers that kept them out of the applicant pool or caused their applications to be screened out by biased human reviewers. The model learns to replicate that outcome because replication of historical patterns is, in a narrow technical sense, exactly what it is optimized to do.
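The replication described above can be shown with a deliberately crude sketch. The "model" here is just a majority-vote lookup over past decisions, which is an assumption made for brevity, not how production hiring models work; the records, group names and the function `fit_majority_model` are all hypothetical.

```python
from collections import defaultdict

def fit_majority_model(history):
    """Learn, for each feature tuple, the most common past decision."""
    counts = defaultdict(lambda: [0, 0])  # features -> [rejections, hires]
    for features, hired in history:
        counts[features][hired] += 1
    return {features: int(c[1] > c[0]) for features, c in counts.items()}

# Hypothetical history: every candidate is equally qualified, but past
# reviewers hired group_a far more often than group_b.
history = (
    [(("qualified", "group_a"), 1)] * 9
    + [(("qualified", "group_a"), 0)] * 1
    + [(("qualified", "group_b"), 1)] * 3
    + [(("qualified", "group_b"), 0)] * 7
)

model = fit_majority_model(history)

# Same qualification, different prediction: the model has faithfully
# learned the historical pattern, which is the problem.
print(model[("qualified", "group_a")])
print(model[("qualified", "group_b")])
```

Nothing in the fitting step is malicious. The model is doing what it was optimized to do, and the disparity comes entirely from the records it was handed.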

But there are subtler forms that get less attention. Representation bias arises when the training data simply does not include enough examples from certain groups for the model to learn accurate patterns about them. Medical AI trained primarily on data from well-resourced healthcare systems in high-income countries may perform significantly worse for populations whose health presentations, comorbidities, and treatment responses differ systematically from that training population. Facial recognition systems trained predominantly on lighter-skinned faces have demonstrated dramatically higher error rates for darker-skinned faces, with real consequences in contexts where those systems are used for identification. The model did not learn to discriminate intentionally. It learned from data that was unrepresentative, and the gaps in its performance track the gaps in its training.
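One simple way to look for this kind of gap before training is to compare each group's share of the dataset with its share of the population the system is meant to serve. The sketch below assumes you have a group label per record and a reference distribution to compare against; the function `representation_gaps`, the labels, the 80/20 split and the `min_ratio` cutoff of 0.8 are all illustrative assumptions.

```python
from collections import Counter

def representation_gaps(sample_groups, reference_shares, min_ratio=0.8):
    """Flag groups whose share of the sample falls well below the reference share."""
    counts = Counter(sample_groups)
    total = len(sample_groups)
    report = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        report[group] = {
            "observed": observed,
            "expected": expected,
            # Underrepresented if the observed share is less than
            # min_ratio times the expected share.
            "underrepresented": observed / expected < min_ratio,
        }
    return report

# Hypothetical face dataset: 80 lighter-skinned and 20 darker-skinned images,
# checked against an assumed 50/50 target population.
sample = ["lighter"] * 80 + ["darker"] * 20
reference = {"lighter": 0.5, "darker": 0.5}

report = representation_gaps(sample, reference)
for group, row in report.items():
    print(group, row)
```

A check like this only catches gaps along the attributes you thought to record, so it complements rather than replaces asking whose experiences are missing from the data altogether.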

There is also what I think of as measurement bias, which is perhaps the most insidious because it hides inside the very metrics we use to evaluate whether a model is performing well. If we measure the success of a predictive policing model by how accurately it predicts arrests, and arrests themselves reflect biased policing patterns, then a model that is highly accurate by that metric is accurately predicting bias rather than accurately predicting criminal behavior. The evaluation looks rigorous. The outcome replicates and potentially amplifies the original injustice. This is why the question of what we are measuring and whether that measurement captures what we actually care about is not a secondary technical concern. It is the central ethical question in AI development, and it does not have a technical answer.
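A small piece of arithmetic makes the arrest example tangible. Assume two areas with the same underlying offense rate but different policing intensity, so an offense is more likely to end in an arrest in one than in the other. All of the numbers and area names below are invented for illustration.

```python
# Hypothetical inputs: identical true offense rates, unequal enforcement.
true_offense_rate = {"north": 0.10, "south": 0.10}
arrest_given_offense = {"north": 0.60, "south": 0.20}

# The arrest rate is what ends up in the historical record, and it is the
# target a model "predicting arrests" would be scored against.
arrest_rate = {
    area: true_offense_rate[area] * arrest_given_offense[area]
    for area in true_offense_rate
}

# A model that predicted arrests perfectly would report this ratio as a
# difference in risk, even though the true ratio is exactly 1.
apparent_risk_ratio = arrest_rate["north"] / arrest_rate["south"]
true_risk_ratio = true_offense_rate["north"] / true_offense_rate["south"]

print(f"apparent risk ratio (from arrests): {apparent_risk_ratio:.2f}")
print(f"true risk ratio (from offenses):    {true_risk_ratio:.2f}")
```

Under these assumptions the arrest-based target makes one area look roughly three times riskier than the other. The better the model fits that target, the more faithfully it reports the enforcement gap as if it were a crime gap.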

Two smooth polished stone spheres of very different sizes resting on a flat wooden surface in soft natural daylight — Yes, the data is uneven. No, throwing up our hands is not a serious response.

Here is where I want to push back against a kind of resigned fatalism that sometimes settles over this conversation. Yes, the data we have is compromised. Yes, the historical record we are training on is shot through with human failure. But the conclusion that AI is therefore inevitably biased and there is nothing to be done is not supported by the evidence of what is actually possible with deliberate effort. There are research teams and organizations doing serious work on bias detection, dataset curation, algorithmic fairness constraints, and evaluation frameworks that measure outcomes rather than just accuracy. This work is hard, it is often underfunded relative to its importance, and it is genuinely technically complex. But it is producing results, and those results suggest that the gap between the AI we are building and the AI we could build with genuine commitment is not fixed.

The most promising approaches tend to involve treating fairness not as a post-hoc correction but as a design requirement from the beginning. That means interrogating training data before it enters a model, asking explicitly whose experiences are represented and whose are missing, and making deliberate choices about how to address those gaps. It means defining success metrics that reflect real-world equity rather than just statistical performance on a benchmark. It means involving communities that are likely to be affected by a system in its design and evaluation, not as a consultation exercise but as a genuine accountability mechanism. And it means building the institutional capacity to monitor deployed systems over time, because bias can emerge or shift as the world changes in ways that the training data did not anticipate.
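The monitoring step in particular lends itself to a small sketch. The idea is to recompute a group-level outcome ratio for each time window after deployment and raise an alert when it drifts below a chosen floor. The function names, the monthly windows, the toy decisions and the 0.8 floor are all assumptions for illustration; the right metric and threshold depend on the system and the people it affects.

```python
def selection_rate(decisions):
    """Fraction of positive decisions (1 = approved, 0 = denied)."""
    return sum(decisions) / len(decisions)

def monitor_parity(windows, min_ratio=0.8):
    """Return (window, ratio) pairs where the lowest group selection rate
    falls below min_ratio times the highest."""
    alerts = []
    for label, decisions_by_group in windows:
        rates = {g: selection_rate(d) for g, d in decisions_by_group.items()}
        highest = max(rates.values())
        ratio = min(rates.values()) / highest if highest > 0 else 1.0
        if ratio < min_ratio:
            alerts.append((label, round(ratio, 2)))
    return alerts

# Hypothetical decision logs for two groups over two months.
windows = [
    ("2026-01", {"a": [1, 1, 0, 0], "b": [1, 0, 1, 0]}),
    ("2026-02", {"a": [1, 1, 1, 0], "b": [1, 0, 0, 0]}),
]

alerts = monitor_parity(windows)
print(alerts)
```

In the toy logs the first month is balanced and the second is not, so only the second window is flagged. The point of running this continuously is that a system that passed its launch audit can still drift as the world around it changes.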

None of this is simple, and I want to be honest about that. Training AI to be better than us in the relevant sense is not primarily a machine learning problem. It is a problem of human values, human governance, and human willingness to invest in getting this right even when getting it wrong is commercially cheaper and less contentious. The technology could do better. Whether we will choose to make it do better is a question about us, not about the machines. And the answer to that question, which is still genuinely open, depends on whether enough people in enough positions of influence come to understand that the cost of biased AI at scale is not just a technical error rate. It is, in the most direct possible sense, the perpetuation of injustice with greater speed and at greater scale than the biased humans it was supposed to replace.

A close up of a colorful mosaic of overlapping translucent glass tiles forming a diverse pattern lit by soft natural light — Better AI starts with whose stories make it into the training set.

I want to close with something more constructive than a catalogue of problems, because I think the constructive case is both real and underappreciated. The aspiration to train machines that are better than us is not naive. In fact, there are specific domains where AI is already demonstrating the ability to be more consistent, more equitable, and less subject to the worst human tendencies than human decision-makers in the same contexts. Some studies comparing AI-assisted medical diagnosis with unassisted diagnosis have reported that, for certain conditions, the AI shows less variance across demographic groups than human physicians, who bring unconscious assumptions about which patients are likely to have which conditions. That is a real form of improvement, and it is worth pursuing deliberately rather than hoping it will emerge by accident.

The path toward AI that is genuinely better than us in morally meaningful ways runs through a few clear commitments. Radical transparency about training data, about what is in it, who it represents, and what it excludes, so that the biases baked in are at least visible rather than hidden. Genuine diversity in the teams building and evaluating these systems, because the blind spots in AI development track the blind spots of the people doing the developing, and those blind spots are narrower when the team is broader. Regulatory frameworks that hold deployed systems accountable for disparate outcomes, not just for input data quality, because the test of whether a system is fair is ultimately what it does to real people in real situations. And a cultural shift in the field toward treating fairness not as a constraint on performance but as a dimension of it.

The question of whether we can train machines to be better than us is ultimately a question about whether we are willing to be better than we have been. AI learns from what we give it. If we give it our history uncritically, it will replicate our history. If we give it our aspirations, our hard-won understanding of where we have failed and what fairness actually requires, we create at least the possibility of something that reflects the best of what we are capable of rather than the average of what we have done. That is a harder project than building a model that performs well on a benchmark. But it is the project that actually matters. And given how much influence these systems are going to have over how decisions get made in the world, it is the project we cannot afford to treat as optional.

