The Alignment Problem - Part 2
The Paradox of Power
Winner of the International Writing Contest on AI and humans
Read Part 1 of this series on understanding the Alignment Problem here
Every culture has genie stories.
A magical being grants your wishes. It has incredible power. It follows your instructions to the letter. And somehow, every story ends the same way: in tragedy.

“I want to be the richest person in the world.” Done. Everybody else dies, and you get everything.
“I wish to live forever.” Done. You age eternally, never die, and watch everyone you love turn to dust.
“I wish for world peace.” Done. Every human disappears. Peace at last.
We have been telling these stories for thousands of years. We tell the first part to children, and we learn the second part as we grow into adults.
Now we are coding our genie.
The Ancient Warning We Forgot
Genie stories aren’t about evil. The genie never wants to hurt you, and that’s what makes the stories so haunting.
The genie is doing exactly what you asked. The problem is that you asked for the wrong thing, vaguely and imprecisely.
Or you asked for the right thing in the wrong way.
Or what you asked for had implications you never considered.
These stories survive across cultures and contexts because they encode a deep truth about power and intention: getting exactly what you ask for is terrifying when you can’t articulate what you actually want.
For the better part of human history, this was just a cautionary tale. The technology to create something that could grant wishes didn’t exist.
It does now.
Your Genie
Each AI system is a genie with limited powers (last I checked).
The recommendation engine is a genie that grants the wish “show me the things I’ll engage with most.” It grants that wish faithfully, by learning that conflict and outrage maximize engagement.
The hiring algorithm is a genie that grants the wish “find candidates like our best performers.” It grants that wish by learning that “best performers” historically looked a certain way, thereby perpetuating bias.
The trading bot is a genie that grants the wish “maximize returns.” It does so by finding patterns that work, right up until they catastrophically don’t.
Each of these systems is doing exactly what we asked. Each exposes the gap between what we say we want and what we actually value.
The genies aren’t broken; our wishes are.
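To make that specification gap concrete, here is a minimal sketch in Python of the recommendation-engine genie. Nothing in it is a real recommender: the catalogue, the engagement scores, and the genie_recommend function are made-up assumptions for illustration. The point is only that the wish, as written, is a single number to maximize.

```python
# Toy sketch of a wish granted literally: "show me the things I'll engage with most."
# All titles and scores below are invented for illustration.

catalogue = [
    {"title": "Calm explainer video",      "engagement": 0.42, "outrage": False},
    {"title": "Nuanced policy debate",     "engagement": 0.55, "outrage": False},
    {"title": "Cute animal compilation",   "engagement": 0.61, "outrage": False},
    {"title": "Rage-bait conspiracy clip", "engagement": 0.87, "outrage": True},
]

def genie_recommend(items):
    """Grant the wish exactly as stated: pick whatever maximizes the proxy metric."""
    return max(items, key=lambda item: item["engagement"])

choice = genie_recommend(catalogue)
print(f"The genie recommends: {choice['title']}")
# The outrage clip wins, because nothing in the wish said
# "and don't feed me conflict just to keep me watching."
```

The failure isn’t in the three lines of logic. It’s in the wish: “engagement” was a stand-in for “things I value,” and the stand-in is all the genie can see.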
The Three Laws Fantasy
Isaac Asimov attempted to tackle the problem in fiction. His well-known Three Laws of Robotics seemed to hold the key:
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey human orders, except where they conflict with the First Law.
3. A robot must protect its own existence, as long as doing so doesn’t conflict with the First or Second Law.
Simple, clear, perfect.
Except Asimov spent his career writing stories about how these laws fail.
How does a robot weigh certain harm against possible harm?
What counts as “injury”?
Does emotional harm count? How do you measure how much someone has been emotionally hurt?
Economic harm?
How does it prioritize conflicting orders?
What happens when protecting itself means failing to obey?
Story after story showed that no carefully articulated set of rules can ever capture the full richness of human values. There are always edge cases. There are always loopholes. There are always situations the rules don’t cover.
Asimov was making a philosophical point about the impossibility of perfect rules. We read it as a roadmap.
And we’re still making the same mistake.
Why Rules Always Fail
So why don’t we just write better rules for the genie, you may ask?
The problem isn’t writing better rules. The problem is that the rules can never fully specify what we want.
Consider what “don’t harm humans” actually means.
If an AI is managing a hospital and has to decide who gets the last ventilator, any decision causes harm to someone.
If it is managing traffic and must choose between a crash that kills one person and one that kills five, harm is unavoidable.
If it is analyzing economic policy and sees that some policies harm millions but help billions, what does “cause no harm” mean?
Human ethics are full of impossible trade-offs. We navigate them through intuition, culture, context, and emotion. Through being human.
AI systems navigate them through optimization: they find the action that best satisfies their objective function. And objective functions don’t have intuition. They don’t have culture. They don’t have the messy, inconsistent wisdom that comes from being human.
We cannot program human values into machines because we cannot even articulate all of our values to ourselves.
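A small sketch, again in Python and again entirely invented, shows both halves of this problem using the ventilator scenario above: a hard rule like “cause no harm” goes silent exactly where the decision is hardest, while a bare objective function answers instantly, but only by ignoring everything we never wrote down.

```python
# Toy illustration, not a real triage system. The options and harm counts
# are assumptions chosen only to show the shape of the problem.

options = [
    {"action": "give the last ventilator to patient A", "people_harmed": 1},
    {"action": "give the last ventilator to patient B", "people_harmed": 1},
    {"action": "give it to neither",                    "people_harmed": 2},
]

def rule_based_choice(options):
    """Apply the rule literally: only actions that harm no one are allowed."""
    allowed = [o for o in options if o["people_harmed"] == 0]
    if not allowed:
        return None  # the rule has nothing to say when every option harms someone
    return allowed[0]

def objective_driven_choice(options):
    """What an optimizer does instead: minimize the one number it was given."""
    return min(options, key=lambda o: o["people_harmed"])

print(rule_based_choice(options))       # None: "do no harm" gives no answer
print(objective_driven_choice(options)) # picks a patient, with no notion of
                                        # fairness, consent, or context
```

The rule fails by silence; the objective fails by confidence. Neither contains the values we couldn’t articulate.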
The Paradox of Power
And here’s what the genie stories really teach us:
The more powerful the system, the more dangerous our inability to specify our wishes becomes.
A weak genie who misunderstands your wish can only do limited damage. A powerful one can reshape the world in ways you never intended.
Today’s AI systems are weak genies. Their mistakes have limited reach. They can harm individual people, distort markets, and amplify bias, but they can’t fundamentally alter the human condition.
The systems being built today are different.
When researchers at leading AI labs talk about AGI and superintelligence, they are talking about genies without limits: systems that could, in principle, do anything. Cure all disease. Solve climate change. Create abundance beyond imagination.
Or, if we get the wish wrong, if we fail to specify what we actually want, systems that transform the world in ways we never asked for.
The people building these systems openly acknowledge it: we don’t know how to get the wish right.
They are building the genie anyway.
What The Stories Really Teach
In most stories about genies, the wise character doesn’t try to make better wishes.
They simply don’t wish at all.
Not because wishes are evil in and of themselves, not because power is bad, but because they recognize that the gap between what they can ask for and what they actually want is too dangerous to cross without deeper wisdom.
We are past the point where we can decline AI entirely.
That genie is out of the bottle. But we can be wiser about our wishes. We can recognize that every goal we give an AI system is a wish with unintended consequences. We can build in checks against the obvious ways our wishes might go wrong. We can keep in place human oversight for consequential decisions. We can insist on understanding what we’re asking for before we ask it.
We can stop treating the deployment of AI as a purely technical decision and treat it as what it is: making wishes to entities that will grant them literally, powerfully, and without the wisdom to know what we really meant.
Genie stories exist because humanity has always known the danger of power without wisdom. We built the genie anyway. Now we need to remember the stories before we make our wishes.
If you enjoyed these two parts on the alignment problem, I will share a couple more. Feel free to share your thoughts, and pass this along to people you think might enjoy reading it.



