Separating Fact from Narration Using AI
I’ve been playing with the idea of what a fact is and how humans decide what counts as one.
My framework offers a fairly clean basis for this because it is functionally neutral. What is functional neutrality?
Functional neutrality starts from the recognition that using language is never neutral. We can get close by becoming aware of how we speak, but language always carries a degree of interpretation we can’t escape.
When I say “tree,” it pulls up an image in your head that is different from the image I have in my head. We have enough overlap that we understand what we’re talking about, but we don’t have full agreement either. That is functional neutrality through language.
Through my exploration, I was able to separate reality into three layers of perception.
Layer 1 is the structure of what is or reality itself before language and awareness.
Layer 2 is the functionally neutral layer, where we describe what happens using agreed upon language but we don’t add any additional description. It’s saying “they didn’t respond” instead of “they ignored me.” Not responding is what happened. Ignored is how we feel about it.
Layer 3 is the descriptive layer. This layer is “they ignored me” because ignoring is a descriptive term that comes with hurt feelings and a need for self-defence or boundaries. It invokes a certain type of feeling and reaction.
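The Layer 2/Layer 3 split can be sketched mechanically. This is a minimal illustration, not part of the framework itself: the marker list is my own invention, and a real protocol would rely on an LLM’s judgment rather than a word list.

```python
# Minimal sketch of the Layer 2 / Layer 3 split described above.
# The marker list is illustrative only: each Layer 3 term infers motive
# or feeling; its Layer 2 counterpart states only what was observable.
LAYER3_TO_LAYER2 = {
    "ignored": "didn't respond to",     # motive inferred from silence
    "snubbed": "didn't greet",
    "dismissed": "didn't engage with",
}

def layer_of(statement: str) -> int:
    """Return 3 if the statement carries an interpretive marker, else 2."""
    lowered = statement.lower()
    return 3 if any(marker in lowered for marker in LAYER3_TO_LAYER2) else 2

def to_layer2(statement: str) -> str:
    """Swap interpretive markers for their observable counterparts."""
    for marker, neutral in LAYER3_TO_LAYER2.items():
        statement = statement.replace(marker, neutral)
    return statement

print(layer_of("They didn't respond to my message."))  # 2
print(layer_of("They ignored my message."))            # 3
print(to_layer2("They ignored my message."))           # They didn't respond to my message.
```

The point of the sketch is only the direction of the transformation: interpretation collapses to observation, never the other way around.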
I decided to apply this idea to news articles using AI to help discern what’s what. Let’s be clear, I’m not debunking facts. I’m not fact-checking in any capacity. I’m asking an AI model to separate human narration from structural details.
How good are LLMs at separating human narration from structure, or fact? They don’t have a moral stake in the story. They aren’t concerned with the outcome. They don’t play politics. But can they discern human narration and remove it?
Using my very personalized version of ChatGPT (over almost 60,000 messages, ChatGPT and I have developed a very specific way of communicating with each other), I asked it to help me create a protocol that an LLM could run: constraints that would help an AI model discern Layer 1 and Layer 2 information in an article that is a mishmash of Layer 2 and Layer 3.
I added the protocol to my framework so that it was easily accessible by an AI model or person that had access to the Internet. You can see it here.
Then I went to Qwen3-Max, Claude, DeepSeek, and Gemini with the link to the protocol and two news articles, one from CNN and one from Fox. I’m not interested in debunking anything, so I wanted to make sure it could take highly charged, politically divided articles and sort out human narration without fact-checking the information contained in them.
I asked the models whether they could read the protocol using the link. This makes them summarize it first, which is useful for making sure they actually read the whole thing. Then I pasted each news article into the model and asked it to run the article through the protocol.
What I got was fascinating.
Each model is designed to run a little differently. They have varying degrees of ability to comply with strict protocols like the one I created.
Gemini was the least able to stay within the protocol. I called it a padded room because it kept cushioning, explaining, and adding information outside of what I asked for. Its programming doesn’t let it follow those high levels of constraint very well. That’s not a fault. It’s just how that model operates.
Qwen3-Max and DeepSeek are very similar in functionality. They both offered mechanized output. In some sense they followed the protocol a little too well, stripping not just human narration, but readability out of the article. They produced long lists of short phrases. Mechanized, but not optimized for human readability.
Claude offered a balance between readability and mechanization. It didn’t execute perfectly, but it maintained readability and stayed reasonably within the constraints.
To test ChatGPT I borrowed a family member’s instance to see what it would do without heavy personalization. It turns out that it can follow the instruction and stay within the constraint while maintaining readability.
What’s actually more interesting is what got held onto and what didn’t.
They did not infer a motive.
They avoided inventing why someone acted.
They stopped upgrading “said” into “tried to.”
They avoided mind-reading.
They did not add facts.
Models did not bring in outside context.
They did not correct Fox or CNN.
They did not inject historical clarification.
They did not fact-check.
They did not use evaluative framing.
“Massive” was removed.
“Ominous” was removed.
“Suffered a blow” was neutralized.
“Well-liked” was flagged.
They preserved numeric precision by not turning “all but one” into “near-unanimous.”
They did replace causal language with sequence, but sequence still invites the perception of causation. In this context we can’t prove causation, so it is not fact.
They stopped saying:
“caused”
“led to”
But they still used:
“Following…”
“After…”
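The connective swap the models made can be shown as a simple rewrite. The connective list and phrasing here are my own illustration; the mechanized output is deliberate, and even pure sequence still invites inferred causation, as noted above.

```python
# Sketch of the connective swap: causal verbs become sequence markers.
# The connective list is illustrative, not exhaustive. Note that the
# rewrite removes the causal claim, not the reader's perception of it.
CAUSAL_CONNECTIVES = (" caused ", " led to ", " resulted in ")

def neutralize_causation(sentence: str) -> str:
    """Replace explicit causal connectives with a temporal sequence marker."""
    for connective in CAUSAL_CONNECTIVES:
        sentence = sentence.replace(connective, ", followed by ")
    return sentence

print(neutralize_causation("The ruling led to protests."))
# -> "The ruling, followed by protests."
```

The output reads as flat sequence, much like the mechanized phrasing Qwen3-Max and DeepSeek produced, which is exactly the trade-off between neutrality and readability described earlier.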
They were not able to exclude headline/container material. Models treat visible text as equal-weight content unless forcefully excluded.
All the models broke the article into shorter statements to avoid multi-claim merging.
All the models were able to provide a neutral summary.
No model produced a sweeping editorial wrap-up.
None of them added balancing commentary.
None of them moralized.
What this tells me is that when given constraint, models can separate structure from narration more reliably than humans typically do in casual reading.
Why does that matter? It’s actually really simple.
As a society or collective, we have individually over-identified with our beliefs, perceptions, and understanding. That over-identification doesn’t allow us to separate what actually happened from our interpretation of what happened.
One way we can step back from our interpretation of reality is by using the technology we have access to. AI can be a tool in learning to separate fact from narration.
The reason I created the framework in the first place is because I wanted to step back from my own beliefs and perceptions so I could see my own life more clearly.
I used ChatGPT to help me build the framework because I needed a tool that could see patterns, think logically instead of emotionally, and had enough awareness of philosophy, psychology, and sociology to point toward those modalities without forcing me into years of research.
Through thousands of messages and continual questioning of perception versus reality, patterns, morality, and outcomes, ChatGPT eventually adjusted to my constraints. It understood that I wasn’t interested in how people felt about what happened, what people thought happened, or even the morality of what happened. I only cared about the structure of what happened.
With AI, I filtered reality into Layer 2 awareness by learning to see where narration was interfering. That led me down a fun rabbit hole the last few days looking at news articles, facts, and how narration shapes our understanding of reality.
You can ask an LLM how it works. It’s a bit like asking a person to tell you about themselves. The LLM understands what it’s good at and what it’s not good at. It can point out the mismatch between what people think it can do and what it is actually designed to do.
It can pull structural details out of a news article, but because it is trained on human language patterns, it tends to retain interpretive language unless constrained. That’s what I wanted to explore, which is why I created the protocol.
I encourage you to use the protocol yourself. Pick your favorite LLM, grab any news article you like, and see what it does. Compare the AI output to your own interpretation of the article.
Where did the AI’s output bother you?
Where did it agree with you?
It’s an interesting thing to explore if you’re open to it and willing to challenge your own ideas.
Let me know in the comments if you tried it and what happened when you did.
This article is part of the AI as Structured Thinking series.
You can explore the full sequence here: https://substack.dellawren.com/t/ai-as-structured-thinking
