Prior → Posterior
Bayesian Update
Architecture of Mind · II

The Predictive Brain

Perception is not a window. It is a hallucination that happens to be controlled.

What you see, right now, is not the world. It is your brain's best guess about what is causing the signals arriving at your senses, a guess assembled from prior expectations, weighted by confidence, continuously revised against the error between what was predicted and what arrived. The experience of reality is a generative model running inside a sealed vault. The brain has never seen the sun. It has only ever seen electrochemical signals in absolute darkness, and from those it has constructed everything you have ever perceived. That construction is not a metaphor for perception. It is perception.

This is the predictive processing framework, arguably the most important theoretical development in cognitive neuroscience of the past three decades. It dissolves the traditional boundary between perception and action, between self and world, between cognition and emotion. It offers, for the first time, a unified computational account of how a brain produces a mind. And it reveals something profoundly strange: the confident, vivid, continuous experience of being present in a real world is, in a precise technical sense, a controlled hallucination.

Section 01

The Revolution Nobody Noticed

In 1867, Hermann von Helmholtz published the third volume of his Handbuch der physiologischen Optik, a comprehensive treatment of the physiology and physics of vision. Embedded within it was an idea so radical that it would take a century and a half of neuroscience to fully appreciate. Helmholtz proposed that perception is a process of unconscious inference.

The sensory input available to the brain, Helmholtz argued, is radically underdetermined. The retinal image (a flat, inverted, two-dimensional projection) does not contain enough information to uniquely determine its three-dimensional source. Every visual scene is compatible with an infinite number of possible physical configurations. And yet we see the world as singular, stable, and three-dimensional, with immediate and largely automatic perceptual decisions about depth, lighting, object identity, and spatial position.

The brain resolves this ambiguity, Helmholtz argued, not by passive registration but by active inference. It brings prior knowledge (a model of the world built from past experience) to bear on current sensory data, selecting the most probable interpretation. This inference happens unconsciously, automatically, and at extraordinary speed. What appears in consciousness is not the evidence but the conclusion.

The objects in space appear to us clothed with the qualities of our sensations. Our sensations are only signs for the qualities of the external world, and it is left to our intelligence to learn how to interpret these signs through experience.

Hermann von Helmholtz: Handbuch der physiologischen Optik, 1867

The idea did not disappear after Helmholtz. It went underground. Behaviourism (which dominated psychology through the mid-twentieth century) had no room for internal models or inferences. The brain was a stimulus-response machine, and the suggestion that it was actively modelling the world was inadmissible. When cognitive psychology reasserted the legitimacy of internal representations in the 1960s, Helmholtz's framework re-emerged, but it was Erwin Schrödinger (in Mind and Matter, his neglected 1956 Tarner Lectures, published in 1958), Richard Gregory (in his 1970 account of perception as hypothesis), and Jerome Bruner (in his work on perceptual readiness) who kept the inferential tradition alive.

The computational revolution that would give the framework its modern form began in earnest with two contributions. David Marr's 1982 book Vision established the principle that understanding a cognitive system requires analysis at three levels: the computational (what is the system doing and why?), the algorithmic (how does it do it?), and the implementational (what physical substrate instantiates the algorithm?). And Rao and Ballard's 1999 paper, "Predictive Coding in the Visual Cortex," provided a specific, mathematically tractable neural implementation of Helmholtz's unconscious inference, one that made empirically testable predictions about the structure and response properties of neurons in the visual hierarchy.

What Helmholtz had intuited in 1867 now had a mechanism. What that mechanism suggested about the nature of experience was genuinely astonishing.

Section 02

Karl Friston and the Free Energy Principle

Karl Friston at University College London is, by citation count, one of the most influential neuroscientists alive. His free energy principle (developed across a series of papers beginning in 2005) is either the most important theoretical unification in the history of neuroscience or an unfalsifiable mathematical framework whose scope has been inflated beyond what the evidence supports. Both of these positions are held by serious scientists. What is not in dispute is the ambition of the claim or the precision of its formulation.

The free energy principle begins not with neuroscience but with physics and information theory. A biological system that persists through time must maintain itself within a bounded range of states. A fish cannot spend time on dry land. A human body cannot sustain a core temperature of 45°C for more than minutes. Every living system has, implicit in its existence, a set of states that are characteristic of it, the states it must remain in to continue being itself. Friston calls the probability distribution over these characteristic states the organism's phenotypic prior.

The problem for any organism is that it cannot directly observe its own internal states, nor the external causes of its sensory signals. Both are hidden. What the organism has is sensory data (a stream of noisy, ambiguous signals) and a generative model: an internal representation of how its sensory signals are caused. The job of the brain, in this framework, is to make its generative model as accurate as possible, so that it can predict (and therefore control) its sensory inputs.

Variational Free Energy: Friston's Core Formulation
F = E_Q[log Q(θ) − log P(o, θ)]
= KL[Q(θ) || P(θ|o)] − log P(o)
= Complexity − Accuracy

Equivalently: F ≥ −log P(o)
Variational free energy is an upper bound on surprise
Free energy F is a measure of the discrepancy between the brain's internal model Q(θ), its beliefs about hidden causes, and the true generative process P(o, θ), the way observations are actually produced. Minimising free energy means making the model's predictions match the sensory data as well as possible, while keeping the model as simple as possible. The brain never has access to the true surprise of its observations, −log P(o), which Friston calls "surprise" or "surprisal." But it always has access to an upper bound on that surprise: the free energy. So it minimises free energy as a tractable proxy for minimising surprise.
Q(θ) = The brain's approximate posterior, its best current guess about hidden causes
P(o, θ) = The true generative model. How observations o actually arise from hidden states θ
KL[·||·] = Kullback-Leibler divergence, information-theoretic distance between distributions
Surprise = −log P(o). How unexpected the observations are under the model

This is an extraordinarily compressed statement. What it says, in less formal terms, is this: the brain maintains a probabilistic model of what is likely to be causing its sensory inputs. That model generates predictions. Sensory data either confirms those predictions (low surprise, low free energy) or violates them (high surprise, high free energy). The brain is continuously working to reduce the surprise, to make the world, from its internal perspective, as expected as possible.
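The decomposition above can be checked numerically. What follows is a minimal sketch in Python, not anything drawn from Friston's own papers: the discrete hidden cause, the probabilities, and the observation are all invented for illustration, chosen so that the expectations reduce to simple sums.

A Numerical Sketch: Free Energy as a Bound on Surprise

import numpy as np

# Hypothetical generative model: 3 hidden causes, 2 possible observations.
p_theta = np.array([0.5, 0.3, 0.2])          # prior over hidden causes P(θ)
p_o_given_theta = np.array([[0.9, 0.1],      # likelihood P(o|θ), one row per cause
                            [0.2, 0.8],
                            [0.5, 0.5]])
o = 1                                        # the observation that actually arrived

p_joint = p_theta * p_o_given_theta[:, o]    # P(o, θ)
p_o = p_joint.sum()                          # model evidence P(o)
posterior = p_joint / p_o                    # exact posterior P(θ|o)

q = np.array([0.6, 0.3, 0.1])                # Q(θ): the brain's approximate posterior

F = np.sum(q * (np.log(q) - np.log(p_joint)))       # F = E_Q[log Q − log P(o, θ)]
kl = np.sum(q * (np.log(q) - np.log(posterior)))    # KL[Q || P(θ|o)]
surprise = -np.log(p_o)                             # −log P(o)

print(F, kl + surprise)   # identical: F = KL + surprise
print(F >= surprise)      # True: F upper-bounds surprise, equal only if Q is the posterior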

There are exactly two ways to reduce surprise. The first is to update the model so that it generates better predictions: this is perceptual inference, and it corresponds to learning and belief revision. The second is to change the sensory input by acting on the world in ways that fulfil the model's predictions: this is active inference, which is what Friston calls action. In this framework, perception and action are not separate faculties with separate neural implementations. They are two strategies for achieving the same goal: the minimisation of prediction error.
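The symmetry between the two strategies is easiest to see in a toy scalar case. In the sketch below (every number invented for illustration), the same prediction error can be reduced either by revising the belief or by acting on the sensed state; the arithmetic is identical, and only the target of the change differs.

A Toy Sketch: Two Routes to the Same Goal

mu, x = 0.0, 1.0                  # mu: the model's prediction; x: the sensed state
error = x - mu                    # prediction error before any update

# Route 1, perceptual inference: revise the model toward the data; the world is untouched.
mu_perceive = mu + 0.5 * error    # new belief: 0.5

# Route 2, active inference: act on the world to fulfil the prediction; the belief is untouched.
x_act = x - 0.5 * error           # new sensed state: 0.5

print(abs(x - mu_perceive), abs(x_act - mu))   # 0.5 0.5: both routes halve the error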

Karl Friston: Scope and Limits of the Free Energy Principle

Friston's framework is explicitly normative rather than descriptive. It does not claim that all brains do in fact minimise free energy in the mathematical sense. It claims that any system that maintains itself in a bounded set of states over time must, in some sense, be doing something equivalent to free energy minimisation, because a system that allowed its sensory surprise to grow unboundedly would rapidly leave its characteristic states and cease to exist.

This is a controversial move. Critics (including Colombo & Wright, 2018; Klein, 2018) argue that the principle is so abstract that it becomes unfalsifiable: it applies to any self-maintaining system, including thermostats. Friston's response is that the value of the framework is not falsifiability at the level of the principle but the generation of specific, testable models at the level of neural implementation. Whether this defence is adequate remains genuinely contested. What is not contested: the predictive processing models the principle motivates make specific predictions that have repeatedly been confirmed by neural data.

Section 03

Predictive Coding: The Architecture

Predictive coding is the neural implementation of the inferential framework, a specific account of how the brain's cortical hierarchy is organised to achieve prediction error minimisation. It is the mechanism by which Helmholtz's unconscious inference and Friston's free energy minimisation are instantiated in the biology of the nervous system. The 1999 paper by Rao and Ballard, "Predictive Coding in the Visual Cortex," gave this mechanism its modern form, and subsequent work by Friston and colleagues has extended it to the entire brain.

The Cortical Hierarchy

The cerebral cortex is hierarchically organised. In the visual system, information travels from the retina to V1 (primary visual cortex), then through V2, V4, V8, and on to regions in the temporal and frontal lobes that represent increasingly abstract, high-level properties of visual scenes: edges, textures, objects, faces, scenes, meanings. This has traditionally been understood as a feedforward hierarchy: raw sensory data enters at the bottom and is progressively processed into higher-level representations.

What predictive coding proposes is that this hierarchy runs in both directions simultaneously, and that the descending, top-down connections carry predictions, while the ascending, bottom-up connections carry prediction errors.

The Predictive Coding Loop
Higher level → generates prediction → sends DOWN to lower level
Lower level → compares prediction to actual input
Lower level → computes prediction error (residual)
Prediction error → sent UP to higher level
Higher level → updates its representation → generates new prediction
──────────────────────────────────────────────
What "travels upward" in the cortical hierarchy is not the sensory signal.
It is the difference between the signal and what was expected.
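The loop is straightforward to simulate. The sketch below is a bare-bones linear version in the spirit of Rao and Ballard's model, with invented dimensions and random weights rather than anything fitted to cortex: a higher level holds a hypothesis r, sends the prediction W·r down, and revises r using only the error that comes back up.

A Minimal Simulation: The Predictive Coding Loop

import numpy as np

rng = np.random.default_rng(0)

W = rng.normal(size=(8, 3))    # generative weights: 3 hidden features -> 8 input units
signal = rng.normal(size=8)    # the incoming lower-level signal
r = np.zeros(3)                # the higher level's running hypothesis

for _ in range(200):
    prediction = W @ r             # top-down: the prediction sent DOWN
    error = signal - prediction    # bottom-up: only the residual is sent UP
    r += 0.05 * (W.T @ error)      # the higher level revises r to shrink the error

# The residual that remains is the part of the signal the model cannot explain.
print(np.linalg.norm(signal - W @ r))

Note that the update to r never touches the signal directly: as in the cortical proposal, the only thing that travels upward is the error.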

This inversion of the traditional feedforward model has a striking implication: what we normally think of as sensory processing (the transmission of information from the world into the brain) is actually the transmission of error signals. The cortex is not a perception machine that builds up representations of the world from incoming data. It is a prediction machine that sends models downward and receives errors upward, continuously updating its generative model to make the errors smaller.

The Two Cell Types: Prediction and Error

Rao and Ballard's model, elaborated by Friston, makes a specific claim about cortical cell types that has received empirical support. The hierarchy should have two functionally distinct populations of neurons at each level. Prediction neurons (proposed to correspond to the deep pyramidal cells in cortical layers 5 and 6, which project downward via long-range connections) send the model's predictions to lower levels. Error neurons (proposed to correspond to the superficial pyramidal cells in layers 2 and 3, which project upward) carry the prediction errors back up.

The prediction neurons are always active. They are the brain's running hypothesis about the world. The error neurons are silent when predictions are accurate; they have nothing to report. They fire when predictions are wrong. In a cortex that predicts well, most error neurons are quiet most of the time. Consciousness, in this framework, may correspond to the pattern of residual prediction errors that the hierarchy has not yet resolved.

Precision: Confidence Weighting

The predictive coding framework includes a second, crucial parameter: precision. A prediction error by itself does not tell the brain how much to update its model. A very uncertain prediction generates noisy error signals that should be discounted. A very confident prediction generates reliable error signals that should drive substantial model revision.

Precision is the brain's estimate of the reliability (the inverse variance) of its predictions and its sensory signals. It determines the weighting of prediction errors in driving updates. Crucially, precision can itself be modulated: the brain can adjust how much weight it gives to incoming sensory signals versus its current expectations. This modulation is proposed to be implemented, in part, by the neuromodulatory systems (dopamine, acetylcholine, noradrenaline) which are perfectly positioned anatomically to act as gain-control mechanisms across the cortical hierarchy.

Precision-Weighted Prediction Error
Δμ = κ · Π · ε
= learning_rate × precision × prediction_error
The update to the brain's current best estimate (μ) is the product of three terms: a learning rate (κ), a precision weight (Π, the estimated reliability of the prediction error signal), and the prediction error itself (ε, the difference between what was predicted and what arrived). High precision means prediction errors are weighted heavily and drive large updates. Low precision means errors are discounted and the current model is maintained. The brain is continuously estimating not just what is happening but how reliable its estimate of what is happening actually is.
μ = The current posterior estimate, the brain's best guess
Π = Precision, the estimated inverse variance of the prediction error
ε = Prediction error, signal minus prediction
κ = Learning rate. How fast beliefs update
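A brief numerical illustration of the update rule, with invented values: the same prediction error, delivered at two different precisions, produces very different revisions.

A Numerical Illustration: The Same Error at Two Precisions

kappa = 0.1        # learning rate κ
mu = 0.0           # current best estimate μ
epsilon = 2.0      # prediction error ε: signal minus prediction

for Pi in (4.0, 0.1):                    # a reliable vs a noisy error signal
    delta_mu = kappa * Pi * epsilon      # Δμ = κ · Π · ε
    print(f"Pi={Pi}: update {delta_mu:+.2f} -> new mu {mu + delta_mu:.2f}")

# Pi=4.0: the error drives a substantial revision (new mu = 0.80).
# Pi=0.1: the same error is discounted; the estimate barely moves (0.02).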
Section 04

Anil Seth and the Controlled Hallucination

Anil Seth at the University of Sussex has, more than any other contemporary scientist, translated the predictive processing framework into a coherent account of consciousness. His 2021 book Being You argues that conscious experience (perception, emotion, the sense of being a self) is a form of controlled hallucination: a generative model of the world and of the self that is kept under control only by its continuous calibration against sensory prediction errors.

The word "hallucination" is chosen deliberately. The experiences of psychosis (seeing things that are not there, hearing voices, forming beliefs wildly disconnected from evidence) represent a predictive processing system that has lost calibration: one whose top-down predictions are so strong, and whose precision-weighting of incoming sensory signals is so low, that the model runs essentially unchecked, generating experience without adequate error correction. What Seth observes is that this is different in degree, not in kind, from normal perception. Normal perception is a hallucination that happens to be controlled by sensory evidence. Hallucination is a perception that has slipped its moorings.

We're all hallucinating all the time. When we agree about our hallucinations, we call it reality.

Anil Seth: Being You, 2021

The Beast Machine

Seth's second major contribution is his account of the embodied self: what he calls the "beast machine" hypothesis, drawing on Antonio Damasio's somatic marker work. The conscious experience of being a self, Seth argues, is primarily a model of the body. Not a model of the body as an object in space, but an interoceptive model, a continuous prediction of the body's internal physiological states, its drive states, its homeostatic needs.

Interoception (the sense of the interior of the body, including heart rate, respiration, hunger, temperature, visceral tension) is itself a predictive process. The brain maintains a continuous generative model of its body's internal states, generating predictions and comparing them to afferent signals from the body's interior. Interoceptive prediction errors (discrepancies between the predicted and actual state of the body) give rise to emotions. Anxiety is a prediction of threat to the body's integrity. Hunger is a prediction that caloric replenishment is needed. Joy may be a signal that the world is cooperating with the organism's predictions better than expected.

This is not phenomenological speculation. There is substantial evidence that interoceptive prediction errors (particularly mismatch between predicted and actual heartbeats) are measurably associated with the intensity of emotional experience. Studies using the heartbeat detection task have shown that people with high interoceptive accuracy (who can reliably detect their own heartbeats) report more intense emotional experiences, both positive and negative. The body is not the location of emotion. It is the primary data source from which the emotional experience is inferred.

Anil Seth: The Implications for Experience

If perception is a controlled hallucination, several uncomfortable implications follow. First: there is no view from nowhere. There is no perception that is not shaped by prior expectations, by the history of the model, by the current state of the organism. Every observation is theory-laden, not in the weak sense that background knowledge influences interpretation, but in the strong sense that what appears in experience is primarily the prediction, not the data.

Second: the sense of certainty attached to perceptual experience is not a guide to accuracy. The feeling that you are seeing the world as it is, rather than as your model predicts it to be, is itself a feature of the model, not a validation of it. The predictive brain is confident by default. Updating beliefs requires prediction error, which requires sensory signal, which the brain can actively suppress through precision modulation. A model that is very confident generates less prediction error, and therefore less drive to update; it becomes progressively harder to dislodge.

Third: consciousness is not a passive reflection of the world but an active construction, continuously assembled and revised. The seamlessness and stability of experience are achievements of the generative model; they are maintained by work. When that work fails, the seams show: in psychedelic states, in sleep paralysis, in psychosis, in the perceptual distortions of extreme fatigue or grief.

Section 05

The Evidence: Illusions as Prediction Errors

The predictive processing framework makes a precise and testable claim: perceptual illusions are not failures of the perceptual system. They are the correct output of a well-functioning prediction system encountering a situation designed to pit prior expectations against sensory evidence. When an illusion fools you, the system is working exactly as it should: it is generating the most statistically likely interpretation of ambiguous input. The illusion is the prediction.

Visual Illusions: The System Working Correctly

The Müller-Lyer illusion (in which two lines of equal length appear different depending on whether their ends have inward or outward-pointing fins) has a particularly elegant predictive processing explanation. In three-dimensional environments, lines with outward-pointing fins tend to be further from the observer (like the far, inside corner of a room), while lines with inward-pointing fins tend to be closer (like the near, outside corner of a building). The visual system has internalised this statistical regularity and applies it automatically. When you see the Müller-Lyer figure, your visual cortex is making a prediction consistent with thousands of encounters with three-dimensional environments. The prediction is wrong, but only because the stimulus has been engineered to exploit a prior that is accurate almost everywhere else.

Knowing that the Müller-Lyer lines are equal does not make them look equal. This is the key test. Prediction errors drive model update, but the model has been built over a lifetime of three-dimensional experience. A single consciously-apprehended correction cannot overturn it. The top-down knowledge that the lines are equal is represented at a high level of the hierarchy. The bottom-up error signal at lower visual levels continues to trigger the depth-based inference. The illusion persists because the model at the level that generates it has not been updated.

The Rubber Hand Illusion

The rubber hand illusion, first described by Botvinick and Cohen in 1998, offers a particularly clean demonstration of predictive body modelling. A participant sits with one arm hidden behind a screen. A rubber hand, visibly positioned where their hidden hand might plausibly be, is stroked synchronously with the hidden hand. Within minutes, most participants begin to feel that the rubber hand is theirs: they experience a sense of ownership of the rubber hand, and when a hammer is raised to strike it, they flinch and withdraw their hidden hand.

In predictive processing terms, the brain is maintaining a generative model of its body, a model that predicts the location and state of each body part based on proprioceptive, tactile, and visual signals. When the synchronous stroking creates a reliable statistical correlation between the visual experience of the rubber hand being touched and the tactile experience of the hidden hand being touched, the most parsimonious explanation of all these signals is that the rubber hand is the participant's hand. The model updates accordingly. The sense of ownership is the prediction, and it is vivid and behaviourally compelling.
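The parsimony argument can be made quantitative. In the sketch below, hypothetical throughout, the felt hand position is modelled as a precision-weighted average of a proprioceptive cue and a visual cue; the only thing synchronous stroking is assumed to do is raise the precision the model assigns to vision as a cue about the hand.

A Toy Model: Cue Fusion in the Rubber Hand Illusion

# Felt hand position as precision-weighted fusion of two Gaussian cues.
prop_pos, prop_pi = 0.0, 1.0    # proprioception: true hand at 0 cm
vis_pos = 15.0                  # vision: rubber hand at 15 cm

def fuse(x1, pi1, x2, pi2):
    # Posterior mean of two Gaussian cues is their precision-weighted average.
    return (pi1 * x1 + pi2 * x2) / (pi1 + pi2)

print(fuse(prop_pos, prop_pi, vis_pos, 0.1))   # asynchronous: ~1.4 cm, near the true hand
print(fuse(prop_pos, prop_pi, vis_pos, 2.0))   # synchronous: 10.0 cm, dragged toward the rubber hand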

The McGurk Effect: Cross-Modal Prediction Integration

The McGurk effect, described by McGurk and MacDonald in 1976, demonstrates that speech perception is a multimodal predictive inference rather than a purely auditory process. When an audio recording of "ba" is dubbed onto a video of a mouth articulating "ga," most observers hear neither "ba" nor "ga" but "da": a phoneme that the brain has generated as the most plausible cause of the combined visual and auditory signals. The perceived phoneme is not present in either sensory stream. It is the output of a multimodal generative model.

Like the Müller-Lyer, the McGurk effect does not disappear when you know it is happening. Even participants who are told about the effect, who know the audio says "ba," continue to hear "da" when watching the mismatched video. Knowledge does not override the generative model. The model operates below the level where propositional knowledge is represented.

Phantom Limb Pain: Prediction Without Correction

Phantom limb pain (the experience of pain in a limb that has been amputated) is perhaps the most striking clinical demonstration of prediction overriding reality. The brain maintains a model of the body that includes the missing limb. That model continues to generate predictions about the limb's state. Without sensory feedback from the limb to generate prediction errors, the model cannot update. In many cases, the phantom limb is experienced as locked in a painful position (a clenched fist, a twisted joint) that the model predicts but the actual limb cannot correct.

V.S. Ramachandran's mirror box therapy (which creates a visual illusion of the missing limb moving freely and painlessly) works, on the predictive processing account, by providing a visual input that generates prediction errors capable of revising the frozen body model. The visual signal of the reflected arm moving generates a top-down expectation that is incompatible with the frozen phantom, and the conflict drives model revision. Pain is reduced. In some cases, the phantom itself fades. The experience of the body is the model. Change the model's inputs, and you change the experience.

Section 06

Precision Weighting and Psychopathology

If the healthy brain is a well-calibrated prediction engine (one that appropriately weights sensory signals against prior expectations, updating its models with accuracy proportional to the reliability of each source) then mental illness may, in important cases, be understood as a disorder of that calibration. Specifically: as aberrant precision weighting. This is not a metaphor. It is a computational claim with testable consequences.

Schizophrenia: Aberrant Precision and False Inference

Schizophrenia presents, on the predictive processing account, as a disorder of excessive precision assigned to internally-generated signals and insufficient precision assigned to sensory data. When the brain's self-generated predictions are treated as highly reliable and incoming sensory data is systematically discounted, the generative model runs with insufficient error correction. High-level predictions propagate downward without being checked against sensory reality, generating experiences (auditory hallucinations, formed visual hallucinations) that have the phenomenal character of veridical perception because, from the model's perspective, they are highly confident predictions with no competing error signal.

The delusion formation that accompanies psychosis fits this framework precisely. A delusion in this model is not a "false belief" in the naive sense; it is the most plausible inference a system can make when it assigns very high precision to its own predictions and very low precision to disconfirming evidence. Once a high-precision prior is established, it is self-reinforcing: evidence that contradicts it is assigned low precision (because it conflicts with a high-confidence prediction), and therefore does not drive update. The rational architecture produces irrational outcomes.

Crucially, this framework predicts that the positive symptoms of schizophrenia (hallucinations, delusions) should be associated with elevated dopaminergic activity, because dopamine is proposed to encode the precision of prediction errors, particularly in the mesolimbic system. Excessive dopamine would inflate the weight given to certain prediction errors, making them drive update even when they should be discounted. This fits the well-established clinical observation that dopamine antagonists (antipsychotic medications) reduce positive symptoms. It also fits the observation that drugs that elevate dopamine, such as amphetamine, can produce psychosis in healthy individuals.

1%: lifetime prevalence of schizophrenia globally
30%: of schizophrenia patients are treatment-resistant to dopamine antagonists
~80 ms: onset of predictive suppression of sensory response

Depression: The Overconfident Negative Prior

The predictive processing account of depression is both elegant and disturbing. Depression, on this account, is not primarily an emotional state. It is a state of inference. The depressed brain has developed strong, high-precision priors that are negative: predictions that effort will not be rewarded, that social interaction will be painful, that the future will resemble a past that included loss, defeat, or failure. Because these priors are assigned high precision, the prediction errors generated by neutral or positive events (which should update the model in a positive direction) are systematically down-weighted. They cannot penetrate.

This creates the characteristic phenomenology of depression: the inability to enjoy things that were previously enjoyable (anhedonia is the failure of positive prediction errors to register), the cognitive rigidity (the failure to update on disconfirming evidence), the sense that negative outcomes are certain (high-precision negative priors dominate experience). It also explains a puzzling feature of depression: it often gets worse under conditions of forced positive social interaction, which generates strong positive prediction errors that the system interprets as evidence for the unreliability of its own positive predictions, paradoxically reinforcing negative priors.

Autism Spectrum: Precise but Inflexible Predictions

The predictive processing account of autism spectrum conditions, developed by Pellicano and Burr (2012) and elaborated by others, proposes a different calibration failure: not that the brain's predictions are wrong in a particular valenced direction, but that the system assigns uniformly high precision to prediction errors, that is, it updates on incoming sensory data very readily and with high weight. This produces an exquisitely sensitive but contextually rigid perceptual system.

High sensory precision means that stimuli that neurotypical brains habituate to (because reliable prediction errors become low-precision over time) continue to trigger strong responses. Sensory sensitivities, the discomfort with unexpected change, the intense focus on detail, and the difficulty with social prediction (which requires rapid updating of complex social models from ambiguous signals) all fit the pattern of a system that weights prediction errors too heavily to develop the smooth, flexible, contextually-sensitive prior structure that characterises neurotypical social cognition. Importantly, this is a difference in calibration, not in intelligence or in capacity for experience; it is a different operating regime with genuine advantages (accuracy, attention to detail, pattern sensitivity) and genuine costs (inflexibility, sensory overload).

⚠ Unsettled Science: Predictive Processing and Mental Illness

The predictive processing accounts of schizophrenia, depression, and autism described here are influential and increasingly supported by neuroimaging and pharmacological evidence. They are not, however, established fact. The mapping between computational parameters (precision weighting, prior strength) and biological mechanisms (dopamine function, synaptic gain) involves assumptions that are contested. The framework generates qualitative predictions that fit the clinical picture, but quantitative tests (predicting specific parameter values from measurable biology) are still largely aspirational. The framework also runs the risk of becoming unfalsifiable if "precision weighting" is invoked post-hoc to explain any observed difference in sensory processing. Researchers in this space are aware of these risks; the honest position is that these are the most coherent computational accounts currently available, not confirmed mechanistic explanations.

Section 07

Active Inference: Acting to Confirm

The most counterintuitive claim of the free energy framework is its account of voluntary action. In the traditional picture, the sensory system receives inputs, the motor system produces outputs, and the two are mediated by a decision-making process somewhere in between. The brain is a stimulus-response machine, with varying degrees of sophistication in the middle.

In the active inference framework, this picture is inverted. Voluntary action is not the output of a decision. It is the mechanism by which the brain fulfils its own proprioceptive predictions. The motor cortex does not send commands to the muscles to produce movement. It sends predictions about where the limbs should be, and the muscles (via classical reflex arcs) act to fulfil those predictions by reducing the discrepancy between where they are and where they are predicted to be. The brain reaches a hand toward a cup not by commanding the arm but by predicting (with high precision) that the hand is at the cup. The arm's task is to make that prediction true.

Active Inference: Action as Prediction Fulfilment
Action: ȧ = −∂F/∂a
Motion minimises free energy through sensorimotor channels

Proprioceptive prediction error → reflex arc → movement
The motor system acts to fulfil proprioceptive predictions
The gradient of free energy with respect to action (a) specifies how the organism should move to reduce its sensory surprise. This is mathematically equivalent to the organism acting to fulfil its proprioceptive predictions, its predictions about where its body should be. Movement is the process of making the world conform to the brain's model, rather than making the model conform to the world. Perception does the latter. Action does the former. Both minimise free energy.
ȧ = rate of change of action
∂F/∂a = gradient of free energy with respect to action. How free energy changes as action changes
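The gradient equation can be simulated in a few lines. The sketch below, a toy one-dimensional reach with invented parameters, treats action as directly displacing the hand, so the free energy gradient with respect to action is just the precision-weighted proprioceptive error.

A Toy Simulation: Reaching as Gradient Descent on Free Energy

pi = 4.0      # precision of the proprioceptive prediction
cup = 1.0     # the model's high-precision belief: "the hand is at the cup"
hand = 0.0    # actual hand position; action changes it directly
dt = 0.05

for _ in range(100):
    # F = (pi / 2) * (hand - cup)**2, so dF/da = pi * (hand - cup)
    # when action displaces the hand one-to-one.
    hand += dt * (-pi * (hand - cup))   # a-dot = -dF/da: descend the gradient

print(hand)   # ~1.0: the movement has made the prediction true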

Attention as Precision Allocation

Attention, in the predictive processing framework, is not a spotlight that selects certain items from a sensory field for further processing. It is the process by which the brain modulates precision, increasing the weight given to certain prediction errors and decreasing the weight given to others. Attending to something means increasing the precision of the predictions and prediction errors associated with it. This concentrates the brain's belief-updating resources on the attended domain.

This has a consequence that is deeply counterintuitive from the traditional view: attention is not primarily bottom-up, driven by the salience of incoming signals. It is primarily top-down, determined by the precision assignments of the current generative model. You attend to what your current model considers most uncertain, most relevant, or most threatening, not merely to what is loudest or brightest. The pop-out of a bright red object in a green field is real, but it is real because the prior model assigns high precision to that kind of colour-contrast signal as predictive of something important, not because bottom-up salience is an independent driver. Everything is inference all the way down.
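Read computationally, attending is just a re-allocation of the Π term from the update rule in Section 03 across input channels. A minimal sketch, with invented numbers: two channels deliver identical prediction errors, and the precision assignment alone decides which one revises the model.

A Minimal Sketch: Attention as Precision Allocation

import numpy as np

errors = np.array([1.0, 1.0])    # identical prediction errors on channels A and B
mu = np.zeros(2)                 # current estimates for the two channels
kappa = 0.1                      # learning rate

attend_A = np.array([3.0, 0.3])  # precision turned up on A, down on B
attend_B = np.array([0.3, 3.0])  # the reverse allocation

print(mu + kappa * attend_A * errors)   # [0.3  0.03]: channel A drives the update
print(mu + kappa * attend_B * errors)   # [0.03 0.3 ]: channel B drives the update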

Why We Seek the Expected

The active inference framework makes a prediction about motivated behaviour that is as sobering as any in the predictive processing literature. If organisms act to fulfil their predictions (to make the world conform to their model) then they should show a systematic tendency to seek out evidence that confirms their existing model rather than evidence that disconfirms it. Not out of bias or laziness, but as a structural consequence of free energy minimisation.

Confirmation bias, from this perspective, is not an error. It is the expected output of a system that is trying to maintain a stable generative model by seeking out environments in which that model makes accurate predictions. The human tendency to self-select into information environments that confirm existing beliefs, to associate with people who share one's worldview, to interpret ambiguous evidence in the direction of existing priors: these are not cognitive failures. They are the organism doing its job.

The practical implication is uncomfortable. Overcoming confirmation bias is not a matter of good intentions or critical thinking training alone. It requires what the framework calls epistemic foraging, actively seeking out environments that generate surprising prediction errors, that are designed to defeat the current model's predictions. This is cognitively and motivationally costly. The system does not want to do it. It generates free energy. It creates discomfort. This is, literally, the neurological structure of the resistance to having one's mind changed.

Section 08

The Self as Prediction

If perception is a generative model, if action is the process of making the world fit the model, if emotion is interoceptive inference, then what is the self? The answer offered by the predictive processing framework, and most fully articulated by Anil Seth, is that the self is the brain's deepest, most embedded, most high-level prediction. The sense of being a continuous, bounded, embodied agent with a particular history and set of characteristics is the most fundamental inference the brain makes, and it is, like all inferences, a model that can be wrong.

The self-model has several components, each corresponding to a different level of the brain's predictive hierarchy. At the lowest levels: the bodily self, the continuous prediction that there is a body in a particular position, with particular states, that belongs to me. This is disrupted in out-of-body experiences, in depersonalisation disorder, and in the rubber hand illusion. At higher levels: the perspectival self, the prediction that there is a point of view, a here from which the world is perceived. And at the highest levels: the narrative self, the prediction that there is a continuous entity through time, with a particular character, who was the subject of past experiences and will be the subject of future ones.

The experience of being a self is a kind of controlled hallucination, a prediction, maintained by the brain, of what kind of thing it is to be you. When the control breaks down, the hallucination changes. Psychedelics, meditation, seizures, and extreme psychological stress all disrupt the self-model in ways that reveal its constructed nature.

Drawing on Seth (2021) and Friston (2010)

Psychedelics and the Dissolution of the Prior

Robin Carhart-Harris and colleagues at Imperial College London have conducted the most rigorous neuroimaging studies of psychedelic states available. Their findings, interpreted within the predictive processing framework by Carhart-Harris and Friston in their 2019 "REBUS" (Relaxed Beliefs Under Psychedelics) model, propose that classic psychedelics (psilocybin, LSD, DMT) act primarily by suppressing the precision of high-level priors. The top-down predictions that normally dominate the generative model are relaxed. The bottom-up prediction errors from sensory signals are allowed to propagate further up the hierarchy with less suppression.

The phenomenological consequences of this are exactly what the framework predicts. When high-level priors lose precision, the self-model becomes unstable (ego dissolution is the subjective experience of the brain's highest-level prediction losing its grip). Perception becomes richer and stranger, because lower-level prediction errors that are normally suppressed now propagate to consciousness. Novel associations form, because the hierarchical suppression that normally constrains which lower-level patterns can influence higher-level representations is relaxed. And, importantly, the experience has a lasting quality of insight, because the brain, freed temporarily from the tyranny of its most confident priors, encounters prediction errors it would normally never experience, and cannot avoid updating on them.

Meditation: Training Precision Modulation

Contemplative traditions have, for centuries, developed practices that (described in predictive processing terms) train the practitioner to modulate the precision of their priors. Mindfulness meditation, in its basic form, involves attending to sensory experience as it is, rather than through the interpretive overlay of the habitual generative model. The instruction to "observe thoughts without identification" can be read as an instruction to notice predictions without treating them as high-precision facts. The instruction to "sit with discomfort" is an instruction to allow interoceptive prediction errors to propagate upward without immediately triggering action to suppress them.

Neuroscientific studies of experienced meditators show reduced activity in the Default Mode Network (the network associated with self-referential, narrative processing) during meditation. This is consistent with the predictive processing account: the narrative self-model is a high-level prior; its suppression during meditation corresponds to a state in which that prior is assigned lower precision. The experience of "no-self" or "selflessness" described in advanced meditative traditions is, in this framework, not a metaphysical claim but a phenomenological report about what it is like to run the brain with the self-model's precision turned down.

Section 09

Where the Theory Is Genuinely Contested

The predictive processing framework has achieved a dominance in theoretical neuroscience that makes it easy to mistake for established fact rather than a powerful and productive theoretical programme. The obligation to intellectual honesty requires a direct account of where the framework is contested, where the evidence is weak, and where the theory's claims outrun its foundations.

The Dark Room Problem

The most persistent objection to the free energy principle is what Friston himself calls the "dark room problem." If organisms minimise prediction error, why don't they simply find the most boring, predictable environment possible (a dark, silent room) and remain there, where prediction errors are minimal? Under strict free energy minimisation, the expected organism would be profoundly anhedonic and asocial.

Friston's resolution invokes the concept of epistemic value: the organism's prior expectations include not just what sensory states it expects to find, but what states it expects its species to occupy, and these species-level priors include active exploration and information-gathering. A human prior includes the expectation of social interaction, novel problem-solving, and varied sensory environments. The organism seeks novelty because its phenotypic prior expects novelty. This resolution is logically coherent but has been criticised for being circular: the framework can explain anything by positing the appropriate prior, and there is no independent way to specify what the "correct" prior for a given organism should be.

The Measurement Problem

The specific claims of predictive coding (about the functional roles of superficial and deep pyramidal cells, about the precision-encoding function of neuromodulatory systems, about the hierarchical organisation of prediction and error signals) are testable in principle but have proven difficult to test in practice. The spatial and temporal resolution of current neuroimaging techniques is insufficient to independently track "top-down prediction" and "bottom-up error" signals. Single-cell recording can do this, but only in animals, and primarily in early visual areas.

The evidence that does exist is largely consistent with predictive coding but is not uniquely entailed by it. Many findings that fit the predictive processing framework also fit alternative accounts, including simpler feedforward models with recurrence and attentional modulation that do not require the full theoretical apparatus of active inference and free energy minimisation.

The Intentionality Question

A deeper, philosophical objection has been raised by Patricia Churchland, Daniel Dennett, and others. Predictive processing theories describe the brain as "making inferences," "forming beliefs," "generating predictions": the language of intentional states, mental states with content. But the framework is formally stated in terms of probability distributions and information-theoretic quantities. There is a gap, possibly an unbridgeable one, between the mathematical machinery and the claim that the brain "really" predicts, "really" infers. Whether this is an objection to the framework or to our understanding of what it means for physical systems to have intentional states is a question that reaches into the philosophy of mind and has not been resolved.

Strongest empirical support

The suppression of neural responses to predicted stimuli compared to unpredicted ones is well-documented across species and brain regions. Prediction error signals in midbrain dopamine neurons are among the most robustly replicated findings in systems neuroscience. Multimodal integration (as in the McGurk effect) is most parsimoniously explained by a shared generative model. The psychopharmacology of psychedelics and its phenomenological consequences fit the REBUS model with striking precision.

Where evidence is weakest

The specific claim that all action is the fulfilment of proprioceptive predictions remains controversial among motor neuroscientists. The mapping between computational parameters and specific neurotransmitter systems is assumed rather than demonstrated. The account of consciousness (that prediction error residuals are what appear in experience) is not yet connected to the hard problem in a way that most philosophers of mind find satisfying. The "all the way down" universality of the framework has not been demonstrated outside of sensory cortex.

Red Thread

What This Changes

You do not perceive the world. You perceive your brain's best model of the world, continuously updated by the gap between what was predicted and what arrived. The model is built from your history, your body, your culture, your current drive states. It is weighted by confidence. It is self-stabilising. It resists revision in proportion to its own certainty. Knowing this does not dissolve the experience; the controlled hallucination continues, vivid and insistent. But it fundamentally reframes what the project of learning, of changing one's mind, of overcoming bias, actually requires.

The brain evolved to predict, not to perceive. The difference matters, because a perceiving brain is at the mercy of its inputs, while a predicting brain is shaped by its history, its body, and its accumulated expectations. Understanding this is not a consolation. It is a lever.

Next: III: Hypnosis from First Principles · The Narrator Loses the Pen