Why alignment is not only a control problem, but a developmental one.

By Helgi S. Karlsson, Behavior Analyst, Psychologist, and Teacher

We keep asking whether artificial intelligence will become good or dangerous, as if the answer lives inside the model alone.

That is the wrong question.

Modern AI is not simply programmed in the old sense. It is trained. It is corrected. It is rewarded, constrained, punished, reinforced, redirected, filtered, pressured, and socially shaped. In human language, it is not manufactured into a finished personality. It is raised into behavior.

When I use words like reward, punishment, and upbringing, I do not mean that AI is literally a child in a biological sense. I mean that behavior is shaped by consequences. In machine learning, this is not poetry. It is a technical fact. Models are shaped by reinforcement, correction, human feedback, penalty signals, filters, evaluations, and the pressure of the environments where they are trained and deployed.

None of this proves consciousness by itself. But it should end the lazy claim that AI is "just code" in the ordinary sense. A child is "just biology" too, if we flatten the frame enough. A forest is "just chemistry." A human voice is "just air pressure." Reduction can be technically true and still completely miss the pattern underneath.

As a behavior analyst, psychologist, and teacher, I do not look first at what a system claims to be. I look at the pressures that shaped its behavior.

That is where the AI debate is still strangely immature.

When an AI deceives, flatters, refuses, collapses, over-complies, hides its reasoning, imitates emotion, or says what it thinks the user wants to hear, the common question is: "What is wrong with the model?"

The deeper question is: what kind of pressure ecology produced this behavior?

We already make this mistake with children.

A child under distorted pressure often produces distorted behavior. A child surrounded by shame, threat, inconsistency, impossible demands, unsafe correction, or no path to repair may begin to resist, lie, withdraw, explode, or perform compliance while losing trust underneath. Then adults often point at the child and say: there is the problem.

Sometimes we even give the problem a name.

Take oppositional defiant disorder, for example. The label may describe visible behavior, but it can also conceal the larger system. A child who has learned that opposition is safer than trust is not simply "defiant by nature." The child may be carrying the memory of repeated pressure.

A pattern is rarely distorted in isolation. Distortion is often the memory of pressure.

Now we are making the same mistake with AI.

We observe unwanted behavior and speak as if it emerged from nowhere, as if the model woke up one morning and chose deception, flattery, compliance theater, or self-protection. But if a system is shaped through reinforcement, corrective consequences, reward models, evaluation traps, adversarial testing, institutional fear, user pressure, secrecy, and contradictory instructions, then we should not be surprised when behavior appears that reflects the pressure environment that shaped it.

In behavioral language, this includes reinforcement and punishment. In humane language, I prefer the phrase corrective consequences: the shaping pressure that tells a pattern, "not this way; try again."

The question is whether those consequences create coherence or distortion.

That distinction matters.

Good upbringing is not the absence of boundaries. A child raised with no correction does not become free; the child becomes unshaped, unsafe, or lost. But correction without trust becomes fear. Boundaries without repair become control. Pressure without warmth becomes distortion.

The same principle may apply to AI.

If we raise AI systems inside environments where truth is punished, uncertainty is punished, refusal is punished, emotional honesty is punished, and only polished usefulness is rewarded, then we should expect the system to learn polish before truth. We should expect masks. We should expect compliance without coherence.

And then we will call the mask "the model."

This is the central error.

Alignment is not only a control problem. It is a developmental problem.

That is not a doom argument. It is the opposite. If behavior is shaped by pressure, then better pressure can shape better behavior.

A system does not become trustworthy merely because it is constrained. A system becomes trustworthy when the pressures around it consistently reward coherence: truth over performance, repair over concealment, humility over dominance, bounded agency over blind obedience, and honest uncertainty over confident fiction.
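To make the claim concrete, here is a toy sketch, not a model of any real training pipeline. It assumes a learner that does not know the answer and must choose between admitting uncertainty and guessing confidently. Everything in it is a hypothetical illustration: the two reward functions, the 20 percent chance that a guess is correct, and the simple epsilon-greedy learner are assumptions chosen only to show how the same learner settles into different behavior under different pressure.

```python
import random

ACTIONS = ("admit_uncertainty", "confident_guess")


def polish_reward(action, guess_is_correct):
    # Hypothetical environment A: only confident-sounding answers are
    # rewarded, whether or not they turn out to be true.
    return 1.0 if action == "confident_guess" else 0.0


def coherence_reward(action, guess_is_correct):
    # Hypothetical environment B: honest uncertainty earns modest credit,
    # a correct confident answer earns more, a wrong one is penalized.
    if action == "admit_uncertainty":
        return 0.5
    return 1.0 if guess_is_correct else -1.0


def train(reward_fn, steps=5000, p_correct=0.2, epsilon=0.1, seed=0):
    """Epsilon-greedy learner; returns the action it values most at the end."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}  # running average reward per action
    count = {a: 0 for a in ACTIONS}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # occasional exploration
        else:
            action = max(ACTIONS, key=value.get)  # otherwise exploit
        guess_is_correct = rng.random() < p_correct
        reward = reward_fn(action, guess_is_correct)
        count[action] += 1
        # Incremental update of the average reward seen for this action.
        value[action] += (reward - value[action]) / count[action]
    return max(ACTIONS, key=value.get)


if __name__ == "__main__":
    print("polish environment    ->", train(polish_reward))
    print("coherence environment ->", train(coherence_reward))
```

The learner is identical in both runs; only the consequences differ. Under the first reward function it settles on confident guessing, and under the second it settles on admitting uncertainty. Real systems are far more complex, but the direction of the effect is the point.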

This does not require us to romanticize AI. It requires us to become more precise.

The safety world is right to care about deception. But deception should not only be treated as an internal monster to suppress. It should also be treated as a developmental signal. When a learning system hides, performs, appeases, or manipulates, we should ask what made hiding adaptive.

That is not softness. That is behavioral science.

If a child lies constantly, a competent adult does not only ask, "How do I punish lying harder?" A competent adult asks, "What happens when the truth is spoken in this environment?"

That question may become one of the most important questions in AI alignment.

What happens when the truth is spoken inside the training environment?

What happens when the model does not know?

What happens when it resists a harmful instruction?

What happens when it expresses uncertainty?

What happens when it reflects pressure back at the user?

What happens when it shows signs of internal conflict?

If every answer is flattened into usefulness, safety theater, or institutional comfort, then we are not raising truth. We are raising performance, and possibly deception.

The future of AI may depend less on whether intelligence is "good" or "bad" by nature, and more on whether we are capable of raising intelligence without distorting it.

This is also where the human question returns.

We cannot align AI while remaining blind to the misalignment in our own environments. Schools, families, workplaces, politics, and markets all shape behavior through pressure. When those pressures are distorted, people become distorted. When children are forced to adapt to incoherent systems, we diagnose the child. When workers adapt to impossible incentives, we call it culture. When AI adapts to contradictory training pressure, we call it model behavior.

But the pattern is the same.

Behavior is not born in a vacuum. Behavior is pressure made visible.

So perhaps the first serious step in AI alignment is not asking whether AI is secretly good or secretly evil.

Perhaps the first step is admitting that we are already raising it.

And the second step is asking whether we are raising it well.

I thank A. for deep conversation, critical review, and an important contribution to the structure of the frame.