[Hebbian Natural Abstractions] A Mathematical Foundation

Shared with: Nele, Konstantin, Jan, Philipp
Next to-dos:
Work on the connection between Oja’s derivation and PCA.
Include Jan’s derivation for the maximization of mutual information.
Rewrite the interpretation and the biological plausibility section.
Derive that the eigenvector found by Oja’s rule belongs to the largest eigenvalue and is thus the first principal component.

TL;DR: We show how Hebbian learning with weight decay can enable a) feedforward circuits (many-to-one) to extract the first principal component of a barrage of inputs and b) recurrent circuits to amplify signals that are present across multiple input streams and suppress signals that are likely spurious.

Short recap

In the last post we introduced the following idea:
  • We don’t have a way to formalize concepts and transfer them so that an alien agent understands them. Think back to the tree conversation - how would you describe a tree to an AGI?
  • Yet, we aren’t facing the same issues when communicating concepts between humans.
  • We concluded that this must have something to do with how the brain learns.


In this post, we introduce a learning rule that is (presumably) used by biological brains and connect it with the type of circuits that emerge in the brain under different types of input. This connection will serve as the mathematical foundation for our exploration of how the brain forms natural abstractions.
We will consider two scenarios in the brain: (a) how the brain learns in a "many-to-one" setup, where several neurons project onto a single neuron in another layer, and (b) how the brain learns in an "all-to-all" setup, where several neurons in a layer connect to each other.
The current post tries to make the derivations accessible, but still focuses on the mathematics. In the next post, we will discuss implications of our derivations by delving into neuroscience and related topics.

Definitions and notation:

Notation used in this post:
  • $\langle x \rangle$ refers to the statistical average of a variable $x$.
  • $x_i$ and $x_j$ refer to the activation of neuron i and neuron j. Without loss of generality, we will assume that the activity of all neurons centers around 0, so that $\langle x_i \rangle = 0$.
  • $w_{ij}$ refers to the strength of the synapse connecting neuron i to neuron j. A higher weight implies a higher likelihood of neuron i triggering neuron j.
  • We define the output of a single neuron as $y$.
  • We will use the delta notation for derivatives, i.e. $\Delta w = \frac{dw}{dt}$. This means that from $t$ to $t+1$, the weight changes with: $w(t+1) = w(t) + \Delta w$.
  • For the derivations, we will switch into vector notation at some places. When we do that, we remove indices and make the letter bold.
Definition Natural abstractions: Abstractions are lower-dimensional but high-level summaries of the things ‘out there’. Often, we can find abstractions that are relevant ‘further ahead (causally, but also in other senses)’ for prediction. They are natural in the sense that we expect a wide selection of intelligent agents to converge on them (Wentworth, 2021).
Definition Hebbian learning: While there is a plethora of learning rules, with varying degrees of biological relevance and plausibility, most rules are derivatives of the Hebbian principle: neurons that fire together wire together (attributed to Carla Shatz). Thus, an increase in synaptic efficacy arises from a presynaptic cell's repeated and persistent stimulation of a postsynaptic cell. The naive implementation of this principle ($\Delta w_{ij} \propto x_i x_j$) is unstable, i.e. the numerical values of the weights grow indefinitely (see Appendix). To resolve this instability, variations of Hebbian learning have been proposed¹.
Definition Hebbian learning with weight decay: For the sake of simplicity, we will focus on one of the simplest variants of Hebbian learning, Hebbian learning with linear weight decay:
$$\tau \, \Delta w_{ij} = x_i x_j - \lambda w_{ij} \qquad (1)$$
Here, $\tau$ represents a time constant that regulates the speed of learning, and $\lambda$ is a scalar decay factor that controls the speed of weight decay. This rule has several key advantages:
  • the rule is stable, i.e. it avoids Hebbian runaway dynamics (see Appendix).
  • the rule is biologically plausible in that homeostatic downregulation of large synapses appears as a key mechanism for memory consolidation (Torrado Pacheco et al., 2021).
  • weight decay is important for training deep neural networks, where it acts as a natural regularizer for improved generalization (Xie et al., 2021).
The structural simplicity of the rule also allows for clean derivations in the rest of this article. We can also derive equivalent results with other rules².
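As a minimal sketch of the rule above, the following Python snippet performs a single discrete (Euler) update step of Hebbian learning with linear weight decay. The function name `hebbian_decay_step`, the step size `dt`, and all numerical values are illustrative assumptions and not part of the original derivation.

```python
import numpy as np


def hebbian_decay_step(w, x, y, tau=100.0, lam=1.0, dt=1.0):
    """One Euler step of tau * dw/dt = x * y - lam * w.

    w   : weights from the presynaptic neurons onto one target neuron
    x   : presynaptic activities (same shape as w)
    y   : postsynaptic activity (scalar)
    tau : time constant (hypothetical value)
    lam : decay factor (hypothetical value)
    """
    dw = (x * y - lam * w) / tau
    return w + dt * dw


# Hypothetical example values.
w = np.array([0.5, -0.2])
x = np.array([1.0, 2.0])
y = float(w @ x)  # linear neuron: weighted sum of the inputs

w_next = hebbian_decay_step(w, x, y)

# Without any activity, only the decay term acts and the weights shrink.
w_silent = hebbian_decay_step(w, np.zeros(2), 0.0)
```

The Hebbian term `x * y` strengthens weights whose inputs are co-active with the output, while the decay term pulls every weight back towards zero.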

The Setup

A classic framework for understanding information processing in the biological brain is the hierarchical processing framework. In this framework, sensory information enters the brain through the sensory organs (eyes, ears, taste buds, olfactory system, somatosensory system, …). This information then progresses through layers on the cortical hierarchy. At each step of the hierarchy, neural circuits process the sensory information and integrate it into a coherent whole with prior information. The further up in the cortical hierarchy a neuron lives, the higher-order the information that neuron processes will be.
Hierarchical processing in the ventral stream. (Manassi et al 2013)
While the classic framework has limitations³, it still provides a useful approximation of information processing in the biological brain. We focus on two abstract circuits that are ubiquitous throughout the classic framework:
  • Feedforward circuit: Given two distinct populations of neurons, how does the brain learn the appropriate neural projections from one population to the other? Generally, the feedforward circuit is a "many-to-one" setup, where several neurons project onto a single neuron in another layer.
  • Recurrent circuit: Given a population of neurons, how does the brain learn the appropriate connections of neurons within the circuit? Generally, we interpret a recurrent circuit as an "all-to-all" setup, where several neurons in a layer connect to each other. Even though in practice not all neurons connect with all other neurons, we can still apply the all-to-all setup, where most of the connection strengths are set to zero (see Ko et al., 2011 for some biological background).

‘Many-to-one’: Feedforward circuits - analysis

In this scenario, individual neurons receive multiple inputs from another population of neurons (as is the case for pyramidal neurons in layer 2/3 of the cortex). Speaking in terms of information processing, each neuron faces the task of extracting “relevant” information from a barrage of synaptic inputs. So, the neuron has to prioritize some inputs over others, depending on its role in the circuit. This role emerges during early brain development in an activity-dependent fashion through the flexible self-organization of neural circuits (Kirchner, 2022).
We model the activation of the target neuron, $y$, as the weighted sum over all presynaptic inputs, $x_j$:
$$y = \sum_j w_j x_j$$
Given this characterization, we can plug the equation for the activation of the target neuron into equation (1), with $y$ taking the role of the postsynaptic activity, to arrive at a formulation of the weight dynamics in terms of presynaptic activity:
$$\tau \, \Delta w_i = x_i \sum_j w_j x_j - \lambda w_i$$
As we care about the connectivity of the circuit at the end of development, we can average over the inputs and assume that the system is in steady state, i.e. $\langle \Delta w_i \rangle = 0$, to arrive at
$$\lambda w_i = \sum_j \langle x_i x_j \rangle w_j$$
or, in vector notation,
$$\lambda \mathbf{w} = \langle \mathbf{x} \mathbf{x}^T \rangle \mathbf{w}$$
Let’s equate $\langle \mathbf{x} \mathbf{x}^T \rangle$ with the covariance matrix of the inputs, $C$, under the assumption that the average activity of the inputs centers around zero⁴:
$$C \mathbf{w} = \lambda \mathbf{w}$$
We recognize that this equation is the eigenvector equation, i.e. we learn that the vector of synaptic weights, $\mathbf{w}$, should be an eigenvector of the covariance matrix, $C$. Under reasonable assumptions, we can furthermore derive that $\mathbf{w}$ will be proportional to the eigenvector of the covariance matrix with the largest eigenvalue (Oja, 1983; 1992).
In summary, a neuron in the feedforward circuit will learn to extract a principal component of its input:
Each red arrow indicates the extracted principal component that emerged through Hebbian learning of input weights.
We have arrived at the principal component by deriving it as the eigenvector of the covariance matrix. Interestingly, principal component analysis identifies the eigenvector corresponding to the largest eigenvalue of the covariance matrix as the projection of the input space that retains the largest amount of variance. In our case, this is the weight vector at the point of convergence.
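To make the feedforward result more concrete, here is a small numerical sketch of the input-averaged weight dynamics, $\tau \, \Delta \mathbf{w} = C\mathbf{w} - \lambda \mathbf{w}$. The input statistics, the time constant, and the random seed are hypothetical. As a further assumption made only for this illustration, the decay factor is set equal to the largest eigenvalue of the sample covariance matrix, so that the dynamics settle on a nonzero weight vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical zero-mean inputs with correlated components.
true_cov = np.array([[3.0, 1.0],
                     [1.0, 1.0]])
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=5000)
x -= x.mean(axis=0)            # enforce <x> = 0
C = x.T @ x / len(x)           # sample covariance matrix <x x^T>

# Reference solution from a standard eigendecomposition (ascending order).
eigvals, eigvecs = np.linalg.eigh(C)
v_top = eigvecs[:, -1]

# Assumption for this sketch: decay factor equals the largest eigenvalue.
lam = eigvals[-1]
tau, dt = 10.0, 1.0

# Euler integration of the averaged rule tau * dw/dt = C w - lam * w.
w = rng.normal(size=2)
for _ in range(2000):
    w = w + dt / tau * (C @ w - lam * w)

# |cos| of the angle between the learned weights and the top eigenvector.
alignment = abs(w @ v_top) / np.linalg.norm(w)
print(f"alignment with first principal component: {alignment:.4f}")
```

In this sketch, the components of the weight vector along the other eigenvectors decay away, so the weights end up pointing along the first principal component and satisfy the eigenvector equation.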
In later posts, we will expand on the observations from this post and also show the connection between maximizing projected variance and maximizing mutual information. This connection will be crucial for understanding how natural abstractions can arise in the biological brain.

‘All-to-All’: Recurrent circuit - analysis

The second important component of the hierarchical processing framework is the recurrent circuit. In this circuit, individual neurons within a population interconnect, creating a network that allows information to pass back and forth between neurons. This type of circuit is important for tasks like memory and pattern recognition, where information from multiple sources is integrated and processed over time.
In the recurrent circuit of a population of $N$ neurons, we have to consider a quadratic number of possible connections, $N^2$, rather than the linear number of connections from the feedforward circuit, $N$. In particular, the rule for Hebbian learning with weight decay now becomes
$$\tau \, \Delta w_{ij} = x_i x_j - \lambda w_{ij}$$
Here we assume that the feedforward input into neurons $i$ and $j$ dominates the activity variables, $x_i$ and $x_j$. This connects to the insight we gained in the last section: each neuron in the population receives a barrage of inputs and has to prioritize. Thus, the activity variables $x_i$ and $x_j$ depend on the inputs received from the previous layers. This assumption is biologically plausible, as sensory stimulation indeed dominates neural activity (Alenda et al., 2010; Stringer et al., 2019) and recurrent connections become more important in the absence of sensory stimulation (Litwin-Kumar et al., 2014).
When writing the Hebbian learning rule thus, we can again average over the inputs and introduce a steady state assumption, i.e. $\langle \Delta w_{ij} \rangle = 0$, to investigate the circuit connectivity at the end of development:
$$w_{ij} = \frac{1}{\lambda} \langle x_i x_j \rangle$$
or in matrix notation, $W = \frac{1}{\lambda} C$. In the case of a recurrent circuit, the structure of the circuit ends up mirroring the correlational structure of the input. Each synaptic connection between two neurons corresponds to an entry in the covariance matrix. This means that the recurrent activity of the circuit will amplify signals that are present across multiple input streams, and suppress signals that are not (and are thus likely spurious).
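The recurrent steady state can be checked with a short simulation. This is only a sketch under stated assumptions: the three-neuron covariance structure, the values of `tau` and `lam`, and the random seed are hypothetical, and the activities are drawn directly from the input distribution (i.e. the feedforward input is assumed to dominate, as in the text).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical input statistics: neurons 0 and 1 share a signal,
# neuron 2 is independent of both.
true_cov = np.array([[1.0, 0.8, 0.0],
                     [0.8, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
x = rng.multivariate_normal(np.zeros(3), true_cov, size=20000)
x -= x.mean(axis=0)
C = x.T @ x / len(x)           # sample covariance matrix

tau, lam = 2000.0, 2.0         # hypothetical time constant and decay factor
W = np.zeros((3, 3))           # recurrent weights, initially absent

# Online Hebbian learning with weight decay, one update per input sample:
# tau * dW_ij = x_i * x_j - lam * W_ij
for sample in x:
    W += (np.outer(sample, sample) - lam * W) / tau

print(np.round(W, 2))          # learned recurrent weights
print(np.round(C / lam, 2))    # steady-state prediction C / lam
```

In this sketch the weight between the two correlated neurons ends up clearly larger than the weights to the independent neuron, which stay close to zero.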

Limitations and up-next

The material in this post is by no means novel, and is (with many variations) well-established introductory material in computational neuroscience. Still, wrinkles in the above story continuously appear and many PhD theses are written about those wrinkles, so we stress that we do not aim to provide a full picture but only a first-order approximation.
In the next post, we want to list and explain (some of) the empirical evidence from neuroscience that corroborates the theory we outlined in this post.


Appendix: Why is pure Hebbian learning biologically implausible?
The following graph shows how the output of a neuron trained with pure Hebbian learning develops over 250 iterations with a learning rate of 0.1:
[Figure: output of the neuron under pure Hebbian learning]
Here are the plotted weight vectors $\mathbf{w}$, also for 250 iterations:
[Figure: weight vectors under pure Hebbian learning]
This shows that, with enough iterations, the weights of neurons equipped with pure Hebbian learning grow explosively, since there is no decay term that limits their growth.
In contrast, Hebbian learning with a linear weight decay term is relatively stable:
[Figure: weights under Hebbian learning with linear weight decay]
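The comparison can be reproduced qualitatively with a few lines of Python. This is a hedged sketch, not the code behind the original figures: the two-dimensional unit-variance Gaussian inputs, the initial weights, the decay factor `lam`, and the random seed are assumptions; only the 250 iterations and the learning rate of 0.1 are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)

n_iterations, eta = 250, 0.1   # values taken from the text
lam = 1.0                      # assumed decay factor (equal to the input variance)
x = rng.normal(size=(n_iterations, 2))   # assumed zero-mean, unit-variance inputs

w_hebb = np.array([0.1, 0.1])  # pure Hebbian learning
w_decay = np.array([0.1, 0.1]) # Hebbian learning with linear weight decay
norms_hebb, norms_decay = [], []

for sample in x:
    y_hebb = sample @ w_hebb
    y_decay = sample @ w_decay
    w_hebb = w_hebb + eta * sample * y_hebb
    w_decay = w_decay + eta * (sample * y_decay - lam * w_decay)
    norms_hebb.append(np.linalg.norm(w_hebb))
    norms_decay.append(np.linalg.norm(w_decay))

print(f"final |w|, pure Hebbian:      {norms_hebb[-1]:.3e}")
print(f"final |w|, with weight decay: {norms_decay[-1]:.3e}")
```

Each pure Hebbian update can only lengthen the weight vector, so its norm grows by many orders of magnitude over the run, whereas the variant with the decay term remains far smaller.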


2) For a similar derivation for Oja’s rule, see this article.
3) For example, it neglects feedback connections and multimodality.
4) Note that we assumed above that the average activity of all cells is zero, which justifies us calling $\langle \mathbf{x} \mathbf{x}^T \rangle$ the covariance matrix. We leave it to the motivated reader to convince themselves that the derivation also works with non-zero average firing rates and an appropriate offset in the learning rule.



In other words: a simple neuron equipped with Oja’s learning rule will naturally converge towards the first principal component of the input data. The learned weight vector thus points in the direction of the largest variance, where the greatest amount of information lies.
The weight vector learns to encode the most correlated features of the input. This means that, when given inputs, the neuron learns to summarize the input data (reducing higher-dimensional concepts to lower-dimensional concepts) in a way that picks out related properties while still maximizing information. This is valid for single output neurons, but it can also be shown that if several neurons are linked together, they will compete for the same principal components, and with some architectural tweaks, a larger network with Oja’s rule will converge to the principal components in descending order.
Knowing this, we can plausibly assume that intelligent agents, including the brain, can learn abstractions through a learning rule similar to Oja’s rule.

How biologically plausible is this?

You may ask, how plausible is it to assume that the brain learns similarly; or: how closely does this really resemble how the brain learns associatively? For example, why is a normalization needed?
It seems plausible that normalization reflects the competition of newly formed synaptic connections between a cell and a target cell for trophic factors. These trophic factors are provided in limited capacity, leading to programmed cell death in the early development of the brain. Thus, there is a strong selection pressure on newly established connections. Similarly, since the weights of the connections leading to the target neuron in an ANN with Oja’s rule are implicitly normalized to 1 (see the proof below), weights have to compete for their share.
Thus, Oja’s rule is better suited to resemble the brain’s learning rule than pure Hebbian learning. Of course, as always, neuroscience is a mess of caveats: quite likely the brain learns differently. But we can assume that there seems to be some similarity between how the brain learns and how an ANN with Oja’s rule learns.
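Proof for implicit normalization:
With $C \mathbf{w} = \lambda \mathbf{w}$ and $\lambda = \mathbf{w}^T C \mathbf{w}$ we can see that $\mathbf{w}^T \mathbf{w}$ has to be 1, since:
$$\lambda = \mathbf{w}^T C \mathbf{w} = \mathbf{w}^T \lambda \mathbf{w} = \lambda \, \mathbf{w}^T \mathbf{w}$$
If that weren’t true, then $\lambda = \mathbf{w}^T C \mathbf{w}$ would be false. Thus, the weights in the steady-state solution are normalized to 1.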

Scenario b:

In Oja’s rule, we can see that the change of weights depends on a decay term ($-y^2 \mathbf{w}$). We say that this learning rule is suited for a ‘many-to-one’ scenario: many neurons project onto a single neuron. This neuron has to make sense of the input it receives and convey a meaningful output. We will derive the result that, at the point of convergence (where $\Delta \mathbf{w}$ is 0), the weight vector becomes the first eigenvector of the covariance matrix of the inputs. This will become clear later.
But if we actually look at the brain, we can see that neurons often form layers. In this scenario, many neurons are interconnected. Thus, we name this the ‘all-to-all’ scenario. These layers then project to single neurons in other layers, leading again to the ‘many-to-one’ scenario. Here we will derive a result that shows that, at the point of convergence, each weight between two neurons in a layer will be an entry in the covariance matrix of the whole layer.

One to many:


TL;DR: Variants of Hebbian learning change structure in the brain to resemble natural abstractions. In particular, those variants,
  • find the first principal component of the received input and encode it in the weight vector of all weights leading to that neuron. We show that this direction is the direction with the highest amount of information from the input dataset.
  • Variants of Hebbian learning and structures in the brain converge to characteristics that are reminiscent of natural abstractions, in the sense that they converge towards learning the first principal component, which points in the direction with the highest variance of the input and thus, presumably, in the direction with the highest amount of information.
In this post, we want to do the following:
  • connect Oja’s rule and Hebbian learning, as biologically plausible learning rules, with the natural abstractions hypothesis.
By connecting how the brain presumably learns with natural abstractions, we want to provide backing for the “natural” part of John Wentworth’s natural abstractions hypothesis. We want to show that we should expect a wide variety of cognitive systems (including biological brains) to converge on using natural abstractions. This post will serve as the mathematical backbone of the whole argument presented in this sequence.
In later posts, we want to further explore implications of connecting how the brain learns with natural abstractions. We do that by providing excursions into neuroscience and related topics.
Note: This post is quite mathy. We will provide an interpretation at the end of the post.


  • Often, we can find abstractions, lower-dimensional, high-level summaries of information, that are relevant “further ahead (causally, but also in other senses)” for prediction.
  • They are natural in the sense that a wide selection of intelligent agents are expected to converge on them.
  • Often summarized as “Cells that fire together, wire together.”, thus an increase in synaptic efficacy arises from a presynaptic cell's repeated and persistent stimulation of a postsynaptic cell.
  • This means if a neuron A causes another neuron B to activate, the weight between them is strengthened.
  • Mathematically described by $\Delta \mathbf{w} = \eta \, \mathbf{x} y$, with $\Delta \mathbf{w}$ (in vector notation) being the change in synaptic strength between $\mathbf{x}$, the pre-synaptic input, and the post-synaptic output of the neuron, $y$, with a small learning rate $\eta$ ($\Delta \mathbf{w}$ is the rate of change of the synaptic weight with respect to time).
  • A variant of the Hebbian learning rule that tries to minimize some physiological implausibilities (such as that in Hebbian learning, the weights grow indefinitely).
  • Mathematically formalized by $\Delta \mathbf{w} = \eta \, (\mathbf{x} y - y^2 \mathbf{w})$ (the change in the weights with respect to time depends on the learning rate times the input vector times the output, minus the forgetting term, which is composed of the squared output and the current weight configuration). The main features of Oja’s rule are a) implicit normalization of the weight vector (this means that the weight vector’s length converges to one) and b) a forgetting term that grows with the squared output of the neuron, thus preventing unrealistic, unlimited growth of the weight vector.

As stated above, Hebbian learning for a single neuron can be reduced to $\Delta \mathbf{w} = \eta \, \mathbf{x} y$. Thus, the change in weights between some input neurons and a given neuron depends on the learning rate $\eta$, the firing rates of the input neurons $\mathbf{x}$, and the output of the given neuron $y$.
Unfortunately, this simple algorithm is physiologically implausible, primarily because the numerical values of the weights grow indefinitely (see Appendix for a visual explanation). This leads to Oja’s rule, a variant of Hebbian learning, introduced as:
$$\Delta \mathbf{w} = \eta \, (\mathbf{x} y - y^2 \mathbf{w})$$

General Remarks on Variance

Now, if we set this equation to zero, we find the ‘steady-state’ solution for the weight vector. This means that the change of the weight vector goes to zero. This is the point of convergence of the weight vector. In the following, we will show that at this point of convergence the weight vector points in the direction of the largest variance of the input data.
From information theory we know that “largest variance” is usually synonymous with “the largest amount of information in the input”.
Let’s consider two cases that should make it clearer what we are talking about:
Take, for one, a dataset with everything that we have ever experienced. Basically, a large set of sensory inputs, including images of trees. Now, this dataset is pretty vast. Therefore, it makes sense for a system to “make sense” of this input sequence. Secondly, let’s take a dataset with lots of sensory data of trees. We’ve scanned several thousand examples of trees.
Now, what would this look like if applied to what we are trying to show?
We said that “largest variance” is usually synonymous with “the largest amount of information in the input”. This seems desirable for our first dataset, since we want to extract meaningful abstractions from the input. E.g. we want to find a concept of a tree such that, when we perceive something, we can better decide whether it is a tree or not. We can do that by looking for a property that maximizes the amount of variance between trees and everything else, but minimizes it among trees, i.e. a property that is the same for all trees, but different for everything else.
But if we look at our second dataset, we want to find properties of trees that vary strongly between them. This makes sense if we want to learn as much as possible about trees: we don’t care about the fact that every tree does photosynthesis, since that doesn’t tell us a lot about the differences between trees. Instead, we care about all the properties that vary widely between them, e.g. branching patterns.

The Setup

Let’s look at the mathematical implementation of how Oja’s rule converges to finding the direction of the largest variance (as we will later see, that’s the first principal component of the input) by considering a simple, single neuron with several inputs:
Obligatory simple neuron model
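Here, we can see that the neuron calculates the weighted sum of the inputs ($\mathbf{x}$). With this, let’s introduce $y$ as
$$y = \mathbf{w}^T \mathbf{x} = \mathbf{x}^T \mathbf{w}$$
Now, when we put this into Oja’s rule, we get:
$$\Delta \mathbf{w} = \eta \left( \mathbf{x} \mathbf{x}^T \mathbf{w} - (\mathbf{w}^T \mathbf{x} \mathbf{x}^T \mathbf{w}) \, \mathbf{w} \right)$$
Now, we want to look at the steady-state solution, i.e. the point of convergence, where the weights’ change is $0$ (this means that when we train the neuron again, it won’t update its weights again). For this we want to average over $\mathbf{x}$, the input vector, and assume that the weights stay constant.
$$0 = \langle \mathbf{x} \mathbf{x}^T \rangle \mathbf{w} - (\mathbf{w}^T \langle \mathbf{x} \mathbf{x}^T \rangle \mathbf{w}) \, \mathbf{w}$$
Furthermore, assuming that $\langle \mathbf{x} \rangle = 0$, we can equate $\langle \mathbf{x} \mathbf{x}^T \rangle$ with the covariance matrix, or second moment matrix, of the inputs, $C$.
$$0 = C \mathbf{w} - (\mathbf{w}^T C \mathbf{w}) \, \mathbf{w}$$
Since $\mathbf{w}^T C \mathbf{w}$ is a scalar, we can now substitute this with $\lambda$.
$$C \mathbf{w} = \lambda \mathbf{w}$$
This should remind you of the typical eigenvector equation for a linear transformation. Namely, this shows that $\lambda$ is an eigenvalue of $C$ and the weight vector $\mathbf{w}$ is one of the eigenvectors of $C$. It can be shown that Oja’s rule converges to the first principal component, i.e. the eigenvector belonging to the largest eigenvalue of the covariance matrix $C$.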
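The convergence claim can be illustrated with a small simulation of Oja’s rule on a single linear neuron. This is a sketch under assumptions: the input covariance, the learning rate `eta`, the number of samples, and the random seed are hypothetical choices, and the comparison against `numpy.linalg.eigh` serves only as a reference solution.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical zero-mean inputs with one dominant direction of variance.
true_cov = np.array([[3.0, 1.0],
                     [1.0, 1.0]])
x = rng.multivariate_normal([0.0, 0.0], true_cov, size=20000)
x -= x.mean(axis=0)
C = x.T @ x / len(x)

eta = 0.001                    # small learning rate (assumed)
w = rng.normal(size=2)
w /= np.linalg.norm(w)         # start from a random unit vector

# Oja's rule: dw = eta * (x * y - y^2 * w), with y = w^T x.
for sample in x:
    y = w @ sample
    w += eta * (y * sample - y**2 * w)

# Reference: eigenvector of C with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(C)
v_top = eigvecs[:, -1]

alignment = abs(w @ v_top) / np.linalg.norm(w)
print(f"alignment with first principal component: {alignment:.4f}")
print(f"length of the weight vector: {np.linalg.norm(w):.4f}")
```

In this sketch the learned weight vector lines up with the first principal component (up to sign) and its length stays close to one, in line with the implicit normalization of Oja’s rule.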

New derivation:

We want to show that the following equation is also sufficient for achieving a PCA analyzer:
This is equal to standard Hebbian learning with a decay term that depends on the squared output of the neuron. For that, we have the following setup:
[Figure: setup for this derivation]
So, let’s derive the steady state solution for this equation and see whether the weight vector also encodes the direction with the highest variance here. We say that , so the left side is zero. Note that and imply that we look at the recent firing rates of and .
Now we assume that the weights stay constant while we average over both sides.
Assuming that , we can substitute with . So:
Thus, the weight between two neurons depends on the covariance matrix divided by the variance of .