“Hebbian” Natural Abstractions in the brain

In the last post, I showed how a neuron equipped with a common variant of the Hebbian learning rule converges to the first principal component of its input, thus encoding the most strongly correlated features of the input data.
The purpose of this post is to a) provide a mechanical understanding of how abstractions form in the brain, b) examine whether the brain actually learns in this correlation-based way and c) extend the notion of (natural) abstractions with further insights.

What is an abstraction?

I will spare you the introduction on “why thinking in abstractions is one of the most important features in the development of the human species”, as I guess you have read some summary of this already.
Similarly, forming abstractions is a desirable capability for sophisticated AI systems. The discussion about whether current large language models already have this capability is polarizing: some say that something like LaMDA is sentient, others say that it’s just a perfect stochastic parrot, and then you can ask whether that isn’t all we do anyway, …
So let’s shove that pile of confusion aside and first talk about what an abstraction even is - so that we then can look for them in neural networks and the brain.
  • Higher-level perception often transcends the notion of perception. Perception is connected to concepts and abstractions (hearing a rumbling noise might induce a mental representation of a helicopter). Furthermore, concepts are not bound to perceiving something: I might connect concepts spontaneously.
The problem we are trying to tackle is “How can we organize all the sensory information we constantly receive, and make useful abstractions from it?”. This is relevant, because we constantly perceive things and associate them with higher-level concepts, like “The voice I am hearing is from my husband”. Concepts form the basis of our thinking, like numbers for mathematics.
Abstracting and compressing all this sensory input seems like a possible solution. We have an abundance of sensory information, from which we carve out a concept by filtering out information that is irrelevant to our goal or end (“noise”). We only keep things that seem relevant for the thing we want to describe, e.g. “trees have differing branching patterns, but these vary quite strongly, so the concept of a tree does not have to have a specific kind of branching pattern.”. In philosophy, this problem is also often framed as finding the “necessary and sufficient conditions” for calling something a tree or similar¹.
So in a way, we are approaching the problem from two sides. We are minimizing the amount of information lost in the abstraction, but at the same time maximizing the abstraction. We bundle things together to form the maximum amount of abstraction that is still useful. This process wholly depends on the goal we are pursuing: a gardener might care about a specific kind of dirt, while a layman might only care about the concept of “dirt”. The gardener has a less abstract understanding of dirt, which would be too specific for a layman’s purposes.
This can be better understood when considering abstracting an apple.
We would lose too much information about the apple if we just considered it round. But we would have too much information about the apple (which defeats the purpose of compression, mentioned earlier) if we just took it as it is and said it’s an Acklam Russet cultivar. In other words, we look for the sweet spot at which we don’t lose predictive power: we are maximizing predictive power. This model will be relevant later on as well.
These abstractions then form the basis for the predictions our brain makes about the world model it inherits from this process. I think that there are two factors that influence the formation and granularity of abstractions:
a) Task-based: As we saw in our gardener example, we might form different abstractions based on our goals and ends, for which we require an abstract or specific world-model.
b) Exposure: I think the way the brain works, just exposure to things makes our abstractions more specific. If I just look at different kinds of dirt every day, I naturally learn to make the distinction between clay-based and sand-based dirt.
These two factors also show that abstractions can change over time. Even if I once knew about different kinds of dirt, I could forget most of the distinctions if I stop being a gardener.
Now, let’s try to find a working definition of what an abstraction is and apply it to the natural abstractions hypothesis.
An abstraction is a compression or induction of a general concept. It is found by bundling properties together and throwing away information in a way that is maximizing predictive power. This trade-off is determined by the exposure to the object and the goal setting in which the object is viewed.
Here, maximizing predictive power means minimizing the amount of information lost while maximizing the amount of information gained about the item that we abstract from.
With Oja’s rule, a biologically plausible learning rule for the correlation-based part of the brain, we saw that the brain learns to extract the first principal component of the input it receives.
To support this hypothesis, we ought to give empirical evidence for it. For this, let’s look at some examples, where it’s plausible to assume that the brain actually compresses its inputs into abstractions.

What is Hebbian learning?

Let’s first reflect on the principle of Hebbian learning again: “what fires together wires together”. This principle implies that when two neurons are repeatedly active at the same time, the connection between them strengthens. For example, if reading this triggers a certain neuron A in your brain, and this activation causes neuron B to activate, neuron A becomes more efficient at triggering an action potential in neuron B. There is no teacher involved who gives feedback on whether this is a good action to take; it just happens. We grow up with vague feedback from our elders, but as we move into adulthood we lose access to that feedback. And strangely, this just seems to work. We will see in the following examples what this leads to in the brain.
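To make “what fires together wires together” concrete, here is a minimal sketch of the plain Hebbian update (weight change = learning rate × presynaptic activity × postsynaptic activity). The setup is an assumption for illustration only: neuron A always fires together with neuron B, neuron C fires at random, and the function name and parameters are hypothetical.

```python
import random

def hebbian_weights(steps=1000, lr=0.01, seed=0):
    """Plain Hebbian update dw = lr * pre * post for two synapses onto neuron B."""
    rng = random.Random(seed)
    w_a = 0.0  # synapse A -> B; A fires together with B (assumed)
    w_c = 0.0  # synapse C -> B; C fires independently of B (assumed)
    for _ in range(steps):
        b = 1 if rng.random() < 0.5 else 0  # postsynaptic activity of B
        a = b                               # A is active exactly when B is
        c = 1 if rng.random() < 0.5 else 0  # C is active at random
        w_a += lr * a * b  # grows every time A and B are co-active
        w_c += lr * c * b  # grows only on chance coincidences
    return w_a, w_c

w_a, w_c = hebbian_weights()
print(w_a, w_c)
```

No teacher signal appears anywhere in the loop: the synapse from A ends up stronger than the one from C purely because A and B are co-active more often. Note that the plain rule lets weights grow without bound, which is one reason normalized variants such as Oja’s rule are used.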

Topographic maps

A prominent example where the brain learns the correlational structure of the “world out there” are retinotopic maps. Retinotopy is described as the “orderly mapping of receptive field position in retinotopic coordinates in a brain region”. Let’s disentangle: receptive fields are two-dimensional regions in visual space. A stimulus inside a receptive field causes the corresponding neuron to spike (driving electrical signals to retinal ganglion cells). Retinotopy then means that receptive fields directly map onto neurons in the visual cortex. These mappings are established early on, through gradients of chemoattractive and chemorepellent molecules, but are later refined through Hebbian learning. In an abstract sense, retinotopic maps form because neurons that have similar receptive fields (receptive fields usually overlap) fire together and thus wire together, i.e. a neuron A that activates because it detected a stimulus in its receptive field, and triggers an action potential in neuron B as a consequence (maybe because the receptive fields overlap or are close to each other), becomes more efficient in doing so. Therefore, visual stimuli that occur together usually trigger neurons that have different, but overlapping, receptive fields.
Maybe you’ve seen these haunting figures, called homunculi, already. The larger a body part is drawn, the more space its cortical area takes up.
Similarly, tonotopic maps show that sounds that are close to each other (in frequency) usually map onto adjacent regions in the brain. The somatosensory system (e.g. stimuli on the skin) maps receptors from adjacent regions on the skin to adjacent neurons in the brain. For the olfactory system, the spatial relations in the epithelium and the olfactory bulb are correlated, which is called rhinotopy.

Place and grid cells

Other researchers were awarded the Nobel Prize for their discoveries of place and grid cells, proposing the “hippocampus as a cognitive map”. Place cells are neurons that become active when an organism enters specific locations in its environment. Grid cells are activated whenever the animal’s position coincides with “any vertex of a regular grid of equilateral triangles spanning the surface of the environment”. Both types of cells are said to make up hippocampal cognitive maps, which encode spatial statistical regularities in the environment. They show that the brain learns about its environment in a way that recognizes similarities and dissimilarities, i.e. “Have I been here before?”.

Left vs. Right Brain

Skipping over the polarizing parts, it’s generally accepted that the brain’s hemispheres are functionally specialized and modular. This means that certain features in the inputs are generally processed in the same areas. E.g. language tasks are generally thought to be processed in the left hemisphere, while the right hemisphere is involved in emotional functions. This was also shown in experiments with various split-brain patients in the 1950s and 1960s, which showed that different brain functions are lateralized.


All of this implies that neurons in the brain pick up on similarities and dissimilarities in the inputs, building abstractions based on certain features of the inputs (e.g. a grid-based depiction of locations). Usually this means that adjacent regions or neurons are responsible for responding to the same stimuli, forming modular functional units that act together to enable higher-order functions like speech.
From this we can deduce that the brain features a distinct modularity that maps certain senses to adjacent regions in the brain. This preserves the spatial, temporal and characteristic relationships of the world out there and implies that the cortex learns, for different senses, the statistical regularities of the input signals. This is what I mean when I say that the brain learns the correlational structure of the world “out there”.

Putting it all together

This all can be a bit confusing. So, let’s try to disentangle and reflect on what we are trying to show here.
In “Hebbian” Natural Abstractions and the Tales of PCA, we showed that a biologically plausible learning rule (Oja’s rule) converges to the first principal component of the input data, i.e. if we go about our days and get a constant stream of input, we begin to find patterns, statistical regularities, in the data. In other words, we want to find the direction of the largest variance in the data. An example: when we want to summarize data on wine, we might find out that wine with a darker color is often more acidic (I claim absolutely no competence here). We also see that wine mostly varies along the “acidity” axis. Thus, the “acidity” feature explains most of the variation in the data, i.e. we take our sample of different sorts of wine and find out that they mostly vary in acidity, so if we want to know as much as possible about our sample, we strive to know about the acidity of our samples. Isn’t that cool? And the brain does that without our help, no hands!
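The convergence claim can be checked numerically. The following is a minimal sketch, not the code from the previous post: it applies Oja’s update w ← w + η·y·(x − y·w), with y = w·x, to synthetic two-dimensional data whose variance lies mostly along the diagonal direction. The data, the function names and the parameter values are all assumptions made for illustration.

```python
import math
import random

def oja_first_component(samples, lr=0.01, epochs=20, seed=0):
    """Run Oja's rule over zero-mean samples and return the learned weight vector."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(len(samples[0]))]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]  # start from a random unit vector
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output
            # Hebbian term y*x plus a decay term -y^2*w that keeps |w| near 1
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

def make_data(n=500, seed=1):
    """Synthetic zero-mean data: large variance along (1, 1), small along (1, -1)."""
    rng = random.Random(seed)
    s = 1 / math.sqrt(2)
    data = []
    for _ in range(n):
        major = rng.gauss(0, 1.0)  # high-variance direction
        minor = rng.gauss(0, 0.3)  # low-variance direction
        data.append((s * major + s * minor, s * major - s * minor))
    return data

w = oja_first_component(make_data())
alignment = abs((w[0] + w[1]) / math.sqrt(2))  # |cosine| with the (1, 1) direction
print(w, alignment)
```

The learned weight vector should line up with the diagonal (up to sign), which is the first principal component of this toy data set. In the wine picture, the diagonal plays the role of the “acidity” axis.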
With this post I want to provide more backing for this claim. We saw that the brain does several things:
  1. The brain seems to build modular regions of neurons that pick up on a particular stimulus and encode the statistical regularities of that stimulus. E.g. the left hemisphere is focused on linguistic abilities.
  2. When it comes to perception, there is often a continuous mapping between the spatial dimension of the sensory organ and the region in the brain. Thus, adjacent neurons in the brain correspond to adjacent cells in the sensory organ, e.g. neurons that have overlapping receptive fields in the visual field.
  3. The brain builds several “maps” of its environment. For example, place cells fire based on being in a certain location or not. All these maps are based on different forms of stimuli (sounds, visual stimuli, etc.).
Now, the natural abstractions hypothesis can be used to formalize these notions:
  • We can imagine the input to the brain (a correlation-based neural network) as a sample from a joint distribution over several variables in the real world that are causally related (or not). E.g. we might inspect several types of trees and run through a forest.
  • The brain then learns a set of features and representations of the input. It learns to encode correlations, variances as well as invariances and independencies. We might learn that a certain type of tree often has a certain type of bark, or that some characteristic is the same for all of them, or does not depend on the type of bark, etc.
  • All of these correlations and characteristics define the natural abstraction of a “tree”. The correlation-based neural network learns high-level lower-dimensional concepts in the inputs it receives. Natural abstractions are useful for predicting and explaining what happens in the world and ideally closely correspond to it. Depending on how granular your world-model is, you might have more specific tree categories.
  • These abstractions are natural in the sense that we expect lots of intelligent agents to converge on them. Thus, abstractions are somehow grounded in the physical, causal structure of the real world. If this weren’t true, why should we expect lots of intelligent agents to converge on them?
In this way, the natural abstractions hypothesis can be seen as bridging low-level brain functions and higher cognitive function, such as reasoning and communication. We see that the brain builds several partial abstractions of its inputs and puts them together in a way that produces a coherent concept that is useful for predicting.
The brain is learning in an unsupervised/correlation-based way, but there probably is an instance in the brain that produces actions to take, based on the generated world-model. This unit is probably analogous to the “Thought Generator” in Steven Byrnes’ work. It’s basically a control unit that selects good actions to take, based on the learned world-model and current sensory input.

Measuring abstractions

There are arguably different kinds of abstractions and on different levels of accuracy.
Abstractions might differ along several axes: e.g. something can be a logical abstraction, i.e. abstracting an argument to something like “If A then B”. It could also be a statistical abstraction, or some other form of abstraction. Then, abstractions can be on different levels of granularity, e.g. a gardener has to have some way of differentiating between kinds of dirt, while this distinction isn’t useful for the predictions I make.
I don’t think that we can rank abstractions as “good” if they have lots of predictive power or “bad” if they don’t. The brain does not have a way of changing these abstractions of its own will if they don’t pan out. It just updates its abstractions if a different pattern occurs, e.g. if I often have to deal with different kinds of dirt and I am sensitive to it, I might update my abstraction of “dirt” to be more fine-grained (no pun intended). Instead, we only learn to take better actions based on our world-model.
In the following I want to compare different kinds of abstractions used in theoretical computer science:

Kolmogorov complexity

Can be described as: “the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produces the object as output.”. Here, the computer program can be seen as the abstraction of the output, e.g. we can imagine an output such as “abcabcabcabcabcabcabcabc” or “woejfosjdfiubsdfsnefsjneflns”. We want to find a computer program that is representing this output in the shortest way possible, i.e. “write abc 8 times” or “write woejfosjdfiubsdfsnefsjneflns”.
The apparent thing here is that it makes sense to find abstractions of the output that describe the output in a meaningful way, i.e. it’s about finding patterns or shortcuts that represent the output equally well. In our case, saying “write abc 8 times” is shorter than “write abcabcabcabcabcabcabcabc”. Thus, “write abc 8 times” is a better abstraction than “write abcabcabcabcabcabcabcabc”.
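Kolmogorov complexity itself is uncomputable, but the length of a compressed encoding gives a rough, computable upper-bound proxy for it. As a hedged sketch, the snippet below uses Python’s standard `zlib` compressor as a stand-in for “the shortest program” and compares the two example strings; `compressed_length` is a hypothetical helper name.

```python
import zlib

def compressed_length(text: str) -> int:
    """Bytes needed by zlib (level 9) to encode the text: a crude proxy, not true K-complexity."""
    return len(zlib.compress(text.encode("utf-8"), 9))

regular = "abc" * 8                          # the "write abc 8 times" string
irregular = "woejfosjdfiubsdfsnefsjneflns"   # no obvious pattern to exploit

print(compressed_length(regular), compressed_length(irregular))
```

The patterned string compresses to fewer bytes than the patternless one, even though it is only slightly shorter to begin with. For strings this short the compressor’s fixed overhead is significant, so only the comparison is meaningful, not the absolute numbers.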

Mutual information

Is defined for two random variables as the amount of information you gain about one random variable if you observe another one. The connection to natural abstractions is rather clear. If you know the state of a presumed natural abstraction, how much do you know about the low-level event that the natural abstraction is predicting? While doing that, mutual information does not distinguish between different kinds of information, e.g. causal or statistical information, which might be relevant for natural abstractions.
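For discrete variables, mutual information can be computed directly from a joint probability table as the sum over all pairs of p(x, y) · log2(p(x, y) / (p(x) · p(y))). The sketch below does exactly that; the two example tables (a high-level variable that fully determines a binary low-level event, and one that is independent of it) are made up for illustration.

```python
import math

def mutual_information(joint):
    """Mutual information in bits, given a joint probability table joint[x][y]."""
    p_x = [sum(row) for row in joint]        # marginal over rows
    p_y = [sum(col) for col in zip(*joint)]  # marginal over columns
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0:  # zero-probability cells contribute nothing
                mi += p_xy * math.log2(p_xy / (p_x[i] * p_y[j]))
    return mi

# Hypothetical examples: rows = state of the abstraction, columns = low-level event.
fully_informative = [[0.5, 0.0], [0.0, 0.5]]
uninformative = [[0.25, 0.25], [0.25, 0.25]]

print(mutual_information(fully_informative))
print(mutual_information(uninformative))
```

Knowing the abstraction in the first table removes all uncertainty about the binary low-level event (one bit), while in the second table it tells us nothing. The calculation only sees the probabilities, which is why it cannot tell causal from merely statistical dependence.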

Solomonoff induction

Now it gets interesting. Solomonoff induction can be defined by imagining a Bayesian agent with unlimited processing power. Using Occam’s Razor and Bayesian reasoning, this agent is supposed to be the universal predictor using the absolute minimum amount of data. Solomonoff induction is basically an unbounded (i.e. without a limitation on computing power) way of doing epistemology.
This imagined agent could do a lot more than we do. E.g. we forget a lot of information that does not seem relevant for predicting stuff we care about. We don’t remember every blue pixel of the sky we ever saw. But a Bayesian agent with unlimited computing power and memory can always benefit from more information. If a new observation is not useful for prediction, it can just keep its old policy; otherwise it can use the observation to make a better prediction.
This is a worry I have with natural abstractions as well. I believe that the reason why we currently see a wide range of intelligent agents converge on the same abstractions is not that they are universal in some way. It’s because we all have computationally similar constraints. It’s computationally efficient to build abstractions, but if you don’t care about a bloated ontology, you could just not use abstractions at all and give everything a new name. If you have no reason to be computationally efficient because you get unlimited processing power and memory, you are likely gonna do that.


1) If you want to know more about this, be sure to check out Suspended Reason’s LessWrong Post on this.

Left-over material

Now, the natural abstractions hypothesis is a way to formalize the notion of four sub-claims:
  1. Abstractions are natural in the sense that we should expect lots of intelligent agents to converge on them.
  2. Following from this, abstractions are somehow grounded in the physical structure of the real world. If they weren’t, we should not expect 1. to be true, because then they would be some kind of subjective thing we don’t care about.
  3. High-level lower-dimensional summaries of things are the abstractions we use in day-to-day interaction: I can talk about an “apple” and you don’t need to know “which apple specifically” to know that I am talking about an “apple”.
  4. The real world abstracts well, i.e. “the information that is relevant "far away" from the system is much lower-dimensional than the system itself.”
With the empirical evidence provided in this post, I tried to show that one class of intelligent agents, humans, follows these principles. The human brain, selective on various stimuli, learns the abstractions or statistical regularities of the world out there. It converges to finding lower-dimensional summaries of the things we perceive all the time, e.g. place cells for locations we pass. We use these concepts on a day-to-day basis, and somehow we share most of them.
At the same time, our neurons have to find good abstractions. Useless abstractions are sorted out and disposed of. What makes an abstraction good is a low degree of redundancy with other abstractions. We want abstractions that are clear and crisp, each explaining different things. Ideally they should share as little information as possible with other abstractions.

Mechanistic understanding of unsupervised or auto-associative learning

1: We can also imagine it like this: If we think about all the different constellations (without common laws of physics) of molecules on earth, the possible versions are endless. But the space of possible worlds is limited by the laws of physics. This space is still too large for us to comprehend, so we try to predict from what usually happens, e.g. from correlations and dependencies we observe. In other words, we constantly predict future sensory data given our current motor actions and our learned world-models. As has been suggested elsewhere, we can extend this correlation-based model using a controller unit that is trained to take the best actions, given its visual representations and memory components.
As we interact with the world to achieve goals, we are constructing internal models of the world, predicting and thus partially compressing the data history we are observing. If the predictor/compressor is a biological or artificial recurrent neural network (RNN), it will automatically create feature hierarchies, lower level neurons corresponding to simple feature detectors similar to those found in human brains, higher layer neurons typically corresponding to more abstract features, but fine-grained where necessary. Like any good compressor, the RNN will learn to identify shared regularities among different already existing internal data structures, and generate prototype encodings (across neuron populations) or symbols for frequently occurring observation sub-sequences, to shrink the storage space needed for the whole (we see this in our artificial RNNs all the time).