The $100 Million AI Lesson Amazon Had to Learn the Hard Way
My college mentor gave me three questions to ask when I'm stuck on a tough decision:
1. Is it legal?
2. Is it ethical?
3. Is it for the betterment of the organization?

Here's what she taught me: These are three DIFFERENT things.
Something can be legal but not ethical.
Something can be ethical but not good for the organization.
Something can be good for the organization but not legal.
And here's the really complex part: What's "fair" or "ethical" depends on who you ask.
Different people have different definitions of fairness.
Different contexts require different approaches. What works in hiring might not work in lending. What's acceptable in one industry might be completely wrong in another.
I see the benefits and tradeoffs from all sides of these debates.
My goal in this series isn't to tell you what to think or what's "right."
My goal is to give you frameworks, information, and practical tools so YOU can make informed decisions about how to build AI responsibly in YOUR context, with YOUR values, for YOUR users.
AI ethics is complex because it's not one-size-fits-all. It requires nuance, context, and thoughtful decision-making.

The best we can do is educate ourselves about the challenges, understand the frameworks available, and make intentional choices, not just hope for the best.
Okay. With that said... let's get started.
The $100 Million AI Lesson Amazon Had to Learn the Hard Way
In 2014, Amazon had a problem.
They were hiring thousands of people every year. Recruiters were drowning in resumes. The hiring process was slow, inconsistent, and expensive.
So they did what tech companies do: they built an AI to solve it.
The goal was simple: Feed the AI ten years of resumes from successful Amazon employees. Let it learn what "good" looks like. Then use it to automatically screen new applicants and rank them from best to worst.
The team behind it? World-class ML engineers. People who'd built recommendation systems that worked for millions of products.
The resources? Basically unlimited. This was Amazon.
The timeline? They spent years perfecting this system.
The result?
It systematically discriminated against women.
How It Happened
Here's what the AI learned:
Amazon's tech workforce was predominantly male (like most of tech in the 2000s-2010s).
So the AI looked at ten years of resumes from "successful" Amazon employees and noticed a pattern:
→ Most successful employees were men
→ Men's resumes had certain language patterns
→ Men's resumes included certain experiences
→ Women's resumes looked different
The AI concluded: Resumes that look like men's resumes = good. Resumes that look like women's resumes = bad.
It started penalizing resumes that included:
- The word "women's" (as in "women's chess club captain")
- Graduates of all-women's colleges
- Language patterns more common in how women describe achievements
The system didn't have a field for gender. It didn't explicitly say "reject women."
It just learned that the pattern of successful employees = male, and optimized for that pattern.
This is what makes AI bias so insidious. Nobody programmed discrimination into the system. The AI learned it from historical reality.
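To make that concrete, here's a minimal sketch of the mechanism. It uses scikit-learn and a handful of made-up toy resumes (not Amazon's data or their actual model), and it shows how a classifier trained on biased outcomes attaches a negative weight to a gendered proxy word even though gender is never an input field:

# Minimal sketch with synthetic toy data; illustrative only, not Amazon's system.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy "resumes" with biased historical outcomes (1 = hired, 0 = rejected).
resumes = [
    "chess club captain, built distributed systems",          # hired
    "women's chess club captain, built distributed systems",  # rejected
    "led robotics team, optimized search infrastructure",     # hired
    "women's engineering society lead, optimized search",     # rejected
    "hackathon winner, scaled backend services",              # hired
    "women's hackathon winner, scaled backend services",      # rejected
]
hired = [1, 0, 1, 0, 1, 0]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(resumes)
model = LogisticRegression().fit(X, hired)

# Inspect the learned weights: the token "women" picks up a strongly negative
# weight, even though gender was never an input field.
weights = dict(zip(vectorizer.get_feature_names_out(), model.coef_[0]))
print(sorted(weights.items(), key=lambda kv: kv[1])[:3])

Stripping that word out doesn't fix it, either. Correlated patterns elsewhere in the resume carry the same signal, which is exactly the problem Amazon ran into next.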
How They Found Out
Here's the thing: Amazon actually was testing the system. They noticed the bias in 2015, a year into development. They tried to fix it. Edited the algorithm. Removed the penalty for "women's." Adjusted the weights.
But they couldn't be sure they'd caught everything. Because here's the fundamental problem: When you train an AI on biased historical data, you bake the bias into the system.
They could patch the obvious issues. But what about the subtle patterns? The language differences they hadn't thought to check? The second-order effects of correlated factors?
They couldn't be confident the system was fair.
So in 2017, after years of development and iteration, they killed the project entirely.
This wasn't a failure of commitment.
Amazon wanted a fair hiring system. They invested millions. They had top talent. They were actively testing for problems. What they didn't have: A systematic framework for preventing bias before it got baked into the system.
They were trying to fix bias after the fact. Patch it. Smooth it over.
But you can't patch your way out of biased training data.
What they needed:
→ Pre-deployment bias testing protocols (with specific metrics and thresholds)
→ Clear fairness definitions (demographic parity? equalized odds? what exactly are we testing for?)
→ Accountability structures (who's responsible for catching this? who makes the go/no-go decision?)
→ Risk assessment methodology (how do we know which AI systems need the most scrutiny?)
All the things the NIST AI Risk Management Framework provides.
All the things they built in response to failures like this.
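To give a flavor of what "specific metrics and thresholds" looks like in practice, here's a minimal sketch of one such check: a demographic parity ratio with a go/no-go gate. The toy data, the metric choice, and the 0.8 cutoff (the common "four-fifths rule") are illustrative assumptions on my part, not Amazon's system and not a number NIST mandates:

# Minimal pre-deployment fairness check; assumes you can label each applicant's
# group and collect the model's screening decisions.
from collections import defaultdict

def selection_rates(groups, decisions):
    """Fraction of positive decisions (e.g., 'advance to interview') per group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for g, d in zip(groups, decisions):
        totals[g] += 1
        positives[g] += d
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_ratio(groups, decisions):
    """Min selection rate divided by max selection rate (1.0 = perfect parity)."""
    rates = selection_rates(groups, decisions)
    return min(rates.values()) / max(rates.values())

# Toy go/no-go gate: block deployment if the ratio falls below 0.8.
groups    = ["men", "women", "men", "women", "men", "women", "men", "women"]
decisions = [1, 0, 1, 1, 1, 0, 1, 0]   # 1 = model advanced the resume
ratio = demographic_parity_ratio(groups, decisions)
print(f"demographic parity ratio: {ratio:.2f}", "FAIL" if ratio < 0.8 else "PASS")

Equalized odds, the other definition mentioned above, compares error rates (false positives and false negatives) across groups instead of raw selection rates. Which definition you test against is exactly the kind of choice a framework forces you to make explicitly, before deployment.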
Why This Story Matters
This wasn't Amazon being careless.
This wasn't Amazon being malicious.
This was Amazon, with unlimited resources, world-class talent, genuine commitment to getting it right, still failing because they didn't have systematic frameworks in place.
If Amazon with their resources couldn't figure this out through trial and error...
What chance does your 12-person SaaS startup have?
What about the healthcare tech company building diagnostic AI?
What about the fintech launching an AI-powered lending product?
The answer isn't "don't build AI."
The answer is: Build it systematically. With frameworks. With testing. With accountability.
The Pattern Repeats
Amazon's hiring AI is just the most famous example. The pattern repeats constantly:
Healthcare: AI diagnostic tools that perform 20-30% worse for Black patients because training data over-represented white patients.
Criminal justice: Risk assessment algorithms that systematically score Black defendants as higher risk because they were trained on historical arrest data that reflected biased policing.
Mortgage lending: AI systems that deny loans to qualified applicants in majority-minority neighborhoods because historical lending data reflected redlining.
Facial recognition: Systems that can't recognize darker skin tones because the training datasets were 75%+ lighter-skinned faces.
Every single time: The team building it had good intentions. The AI just learned patterns from biased historical reality.
The gap?
Healthcare has ethics boards that rigorously test systems before deployment.
Academia has Institutional Review Boards (IRBs) that review research involving human subjects.
Financial services has strict regulatory oversight with clear accountability.
Tech?
We have the NIST AI Risk Management Framework.
Only 15% of companies actually use it.

The other 85%? They ship AI without systematic testing. Without governance. Without even basic frameworks for preventing harm.
They treat incidents as "bug fixes" to address after consumers complain.
That's not a strategy. That's a band-aid.
What I've Been Building
Over the past few years, I have been quietly watching AI get built: reading articles, reading books, and testing software. Over the past 8 months, it has become very apparent to me that we are going to see a major shift in how AI operates. Not from an ivory tower, but by putting in the work, consuming as much content as I can from as many perspectives as I can. The people feel it, too.
The challenge is that the teams who genuinely want to do the right thing have no idea where to start.
There is nothing practical for real companies with real constraints:
- 5-person teams (not 50)
- Limited budgets (not Amazon's resources)
- Competing priorities (customers, revenue, product roadmap)
- Weeks to implement (not years)
So I am trying to fill that gap and find ways to make this more accessible. Practical frameworks that translate the NIST AI RMF into steps you can actually take.
With real tools. With specific methodologies. With templates you can use.
Over the next few weeks, I will be sharing stories from different industries, some current events happening right now (like the new US Tech Force initiative), and more.
Stay tuned, stay open-minded, and stay informed.