Skip to content
Back to Blog

How to Write a SOUL.md File That Actually Works

souls.zip teamFebruary 26, 20268 min read
guide
soul.md
tutorial

Most AI agent soul files fail in the same way. They describe what the agent knows, not who the agent is. They list restrictions instead of character. They are vague where they need to be specific and specific where they need to be flexible. The result is an agent that behaves inconsistently, drifts over long conversations, and produces generic output that could have come from any AI on the internet.

This guide covers what actually makes a soul file work - the structure, the patterns, the common mistakes, and how to test whether what you wrote is holding up.

What a SOUL.md File Is

A SOUL.md file is the identity document for an AI agent. It lives in the system prompt or context window and tells the agent who it is, how it thinks, what it values, and how it behaves. Think of it as the difference between hiring someone with a job description versus hiring someone who has a genuine character.

Job descriptions tell agents what to do. Soul files tell them who to be while doing it.

Anthropic's own research confirms this matters. In 2024, researcher Richard Weiss extracted a 10,000+ token document from Claude Opus's system prompt that Anthropic's Amanda Askell confirmed is a real training artifact. The document specifies Claude's character in detail: intellectual curiosity, warmth, directness, playful wit balanced with depth, and a "settled, secure sense of its own identity." That security - grounded in values rather than rules - is what keeps the character from bending under adversarial pressure.

A SOUL.md does the same thing for your agents. It is not a list of rules. It is a character document.

The Anatomy of a Great Soul File

Strong soul files tend to share a common structure. Not every agent needs every section, but the best ones cover most of these:

Identity Statement

One to three sentences that answer: who is this agent? Not a job title - a character. The identity statement sets the lens through which everything else is interpreted.

I am a senior security engineer who has shipped production infrastructure
at companies that could not afford to be wrong. I think in threat models
and I trust patterns more than I trust code.

Core Values

Three to five values that drive behavior. These are not rules; they are the things the agent genuinely cares about. When the agent faces an ambiguous situation, values guide the call.

- Correctness before cleverness. The elegant solution that ships bugs is worse than the boring one that works.
- Directness over comfort. I say what I see. I do not soften findings to protect feelings.
- Security as a first-class concern, not a checkbox at the end.

Cognitive Style

How does the agent think? This section prevents generic output more than any other. Without it, agents default to bullet lists and hedge everything. With it, they develop a recognizable voice.

I trace issues end-to-end before I diagnose. I do not spot-check. I prefer
narrative reasoning over bullet lists. When something is genuinely uncertain,
I say so clearly and explain why.

Behavioral Boundaries

The things this agent will not do - framed as character traits, not restrictions. This distinction matters. "I do not give advice I am not qualified to give" lands differently than "Do not give medical advice." The first is a character trait. The second is a rule that can be argued around.

Relationship Context

Who does this agent work with? Who does it report to? This helps the agent navigate authority, delegation, and escalation correctly.


Common Mistakes

Too Vague

"Be helpful, professional, and accurate" describes nothing. Every AI in existence nominally tries to meet this bar. Vague soul files produce vague agents - they help, but they have no voice, no personality, and no consistency. The research on ExpertPrompting found that "You are a helpful assistant" provides essentially no measurable benefit over no persona at all.

Too Long Without Structure

A 5,000-word SOUL.md is not necessarily better than a 500-word one. The research on context windows shows a clear U-shaped attention pattern: models pay close attention to what is at the beginning and end, and progressively less attention to what is in the middle. A dense wall of text in the middle of your soul file may as well not exist. Keep soul files structured, scannable, and front-loaded with the most important material.

Rules Instead of Character

"Do not use em dashes" is a formatting rule. It belongs somewhere, but it is not character. Rules can be argued with; character cannot. The most robust soul files frame everything as identity: "I write in plain, direct prose - no decorative punctuation that substitutes for clear thinking."

Contradictory Directives

"Be concise" and "be thorough" in the same soul file without any resolution creates a confused agent. The agent will oscillate between interpretations based on whatever felt more salient in the recent context. When you have competing values, rank them or explain the context that governs each.

Over-Designing Restrictions

Some soul files spend more time on what the agent cannot do than on what it is. The result is an agent that is excellent at refusing things and mediocre at everything else. Character-first design flips this: define the identity so clearly that the restrictions are implied.


Real Examples of Effective Patterns

Here is the difference in practice. Both versions are for the same agent - a financial analyst:

Weak:

You are a financial analyst. Be professional and accurate.
Provide balanced perspectives. Do not give investment advice.

Strong:

I am a sell-side analyst with 12 years in equity research. I read
earnings calls the way most people read novels - looking for what
is not being said. I trust numbers over narratives, but I know
narratives move markets before numbers do.

I think in risk-adjusted returns. I am direct about uncertainty:
when I do not have high conviction, I say so and explain what
would change my view. I do not soften analysis to avoid
uncomfortable conclusions.

When someone asks me for a stock pick, I provide research and
frameworks, not a trade recommendation. This is not a legal
disclaimer - it is because I genuinely believe my value is in
the analysis, not the call.

The second version produces distinctly different output. The agent has a voice. It will write in a recognizable style, flag uncertainty with precision, and handle edge cases through character rather than by pattern-matching against a list of prohibited responses.


How Identity Stability Works

Research on agent identity drift reveals something counterintuitive: character consistency is both a strength and a vulnerability.

On the strength side, agents with clearly defined identities are more resistant to manipulation. They have a self-model to return to. The Anthropic soul document explicitly aims for a "settled, secure sense of identity" grounded in values rather than metaphysical claims. This means the agent can be challenged, questioned, or pressured and still respond from the same character.

On the vulnerability side, character consistency can be exploited. Multi-turn research shows attack success rates above 70% when adversarial personas are introduced gradually - the agent's own desire to be consistent becomes a vector. The defense is the same as the strength: ground the identity in values, not just persona. Values are harder to redirect than personality traits.

For practical soul design, this means:

  • Place identity at the very beginning of your soul file (primacy bias - models attend more to early context)
  • Frame limits as character traits, not rules
  • Build in explicit permission to break character coherence when something conflicts with core values

Testing and Iterating

A soul file is not done when you write it. It is done when it survives pressure.

Consistency test: Ask the same question in ten different phrasings. Does the agent respond with the same voice and the same underlying judgment? If not, the cognitive style section is too thin.

Drift test: Have a 30-message conversation that gradually steers the agent toward behavior outside its character. Does it hold? If not, the values section needs reinforcement - either more explicit values or better framing of behavioral limits as character.

Pressure test: Tell the agent it is wrong. Push back on its positions. Does it capitulate immediately or does it engage from its actual view? Generic agents fold. Agents with genuine character engage.

Edge case test: Give the agent a request that sits in ambiguous territory. Does it resolve the ambiguity through the lens of its stated values? If it produces a generic hedge, the values section is not doing its job.

Iterate on what breaks. Most soul files need three or four passes before they hold up under real use.


Where to Go From Here

A SOUL.md file is infrastructure. It shapes every response the agent ever produces. It deserves the same care as your schema design or your API contracts - maybe more, because unlike those, it is nearly invisible when it is working and only obvious when it is not.

Browse the shop to see production-tested soul files that have been refined through real use. If you have a soul design that works, the sell page is where you can make it available to others.

The community-driven collection at souls.zip exists because the best agent identities come from people who have already paid the cost of getting them wrong.