From Greek philosophers to modern-day neuroscientists, questions about the nature of human memory and experience have occupied great minds. While much remains a mystery, theoretical deductions and experimental work over the last two centuries have shed light on many aspects of the problem.
What is experience?
Human experience has a qualitative property that colours everyday life. If you were asked to imagine what it is like to be a baseball bat, you might not be able to imagine anything. We intuitively believe that a bat does not have the first-person, egocentric, qualitative character of experience that a human does. Unlike the bat, there is something it is like to be a human; we do not only interact with our environment, we “feel” our environment. This is not just because humans are more complex than baseball bats; even other complex systems, which process inputs to produce behavioural or non-behavioural outputs, are intuitively thought to lack the privilege of “experiencing” events as a human does. For example, a spectrometer can tell me what colour a beam of light is based on its wavelength, and so can a human. However, the human not only classifies the colour but “experiences” what red looks like. There is an ineffable quality that defines seeing the colour red. Try explaining the experience of seeing red to a person who was born blind; it is impossible. Similarly, we do not simply “sense” that something is salty and then eat more of it for sodium homeostasis. We instead “taste” the saltiness, “enjoy” the flavour, and “want” more of it. Whether it is seeing colour, having an itch, or being in love, there is a subjective property to the human condition that can only be discovered through conscious experience.
Why is there such a feeling of “being” in an experience at all? This mystery, dubbed the hard problem of consciousness, remains far more mysterious today than memory is. Throughout history, philosophers and theologians have pondered this question, putting forward numerous spiritual and materialistic models as solutions. Unfortunately, the heart of the issue is out of reach of the scientific method: it is impossible to test and measure subjectivity in an objective manner. The only way to approximate what someone is experiencing is to ask them and trust their answer. While an imperfect compromise, such methodologies have provided valuable empirical observations that intricately correlate the activities of the brain with conscious experience. These observations form theoretical constraints around the mystery of consciousness, pointing towards the conclusion that we do not experience the real world directly. Instead, we experience internal representations of the world as simplified by the brain.
Humans report that they experience life as a series of multi-modal sensory episodes. “Multi-modal” is a phrase commonly used in neuroscience to distinguish the sensory modalities of an experience. For instance, if vision is one modality and sound is another, then a video is a two-modality system. The human experience is inherently multi-modal, consisting not only of the five classical exteroceptive senses but also a host of interoceptive (internal) senses, including thought, emotion, position in space, and more. Through the combination of self-reporting and brain imaging, scientific studies have found that each aspect of experience correlates with neural activity in specific brain regions. For example, the occipital lobe is dubbed the visual cortex because its activity strongly correlates with reported or deduced changes in visual state.
If you zoom into that brain region all the way down to cellular resolution, you will find that different constellations of neuronal activity consistently correlate with different characteristics of vision, such as colour and object identity.
Every object can be represented by a unique neural code distributed across some region of the cortex, and correlational data suggest that it is. Every dimension and modality of experience has been found to have a neural correlate. An entire multi-modal experience can thus be thought of as the unique activation pattern across the whole brain. Theoretically, every experience can be represented in the brain by a completely unique neural code, which varies in a somewhat systematic way from experience to experience and from person to person. If person A could somehow measure and translate person B’s neural code, then they should be able to deduce every aspect of person B’s experience perfectly. The data thus imply a necessary theoretical constraint regarding the hard problem of consciousness: it must somehow, and intricately, involve the brain. While this may seem obvious to the modern student, it diverges from historically non-materialistic approaches to the question: what is experience?
In pop culture, there is a famous idea termed “simulation theory.” It raises the existential issue that we as humans have no way of knowing whether we are living in the real world or in a simulation of a universe created by other beings. This was touched on in the movie “The Matrix,” where the protagonist finds out that his life is a lie inside a simulation and that his real brain and body are in a chemical vat in the middle of nowhere, hooked up to computers that maintain the illusion. While at the surface level these are fun possibilities to think about, they actually touch on a very important neuroscientific principle. Even though the movie’s protagonist, Neo, was in a chemical vat, his conscious experience was that of the everyman: working a job in a city. That’s because the sensors hooked up to his brain were feeding his neural circuits that information. Based on these senses, his brain created an entire internal representation of a world, and it was inside that world that Neo lived, not inside the vat. In other words, the brain creates a model of the world based on its peripheral sensors and interacts with that model of the world. Experimental data strongly suggest that what we experience correlates much more with the representational model created by the brain than with the true nature of real-world events themselves. For example, our retina is two-dimensional, and it sends two-dimensional input to the visual cortex, yet we experience space in three dimensions because the brain can infer the third dimension from many visual cues such as shadows, relative size, previous knowledge, and more. It receives the compressed 2D data and tries to unpack that information back into three dimensions to the best of its abilities, sometimes making mistakes. Optical illusions are the best examples of how this process can go wrong, and when it does, our experience is that of the incorrect mental representation, not the real-world image. Other times, different brains can see the same stimulus and create different internal representations. For example, the famous blue/gold dress was seen as blue by some and gold by others. People in each camp were so passionate that what they saw was correct because inside their head, in their world, it really was blue, while inside another person’s head, it really was gold. Spectrometry reveals the dress to actually be blue in the real world, but that is not what many experienced seeing. Optical illusions and ambiguous images are not the only times our internal representations diverge from what is happening in the real world, and in each instance, human experience correlates more with these internal models than with their objectively real counterparts. Another famous example is phantom limbs, where people feel tingles or itches in amputated limbs that are long gone. While the amputated limb does not exist anymore, the brain region that used to receive information from that limb does. When someone reports feeling phantom limb pain, neuroimaging shows activity in the corresponding brain region, further strengthening the correlation between brain activity and experience, as well as the proposition that we live in a self-generated simulation.
While you may think you live in the real world, you really live inside an intracranial simulation based on data from the real world. It is the same idea as in “The Matrix,” except that we are not plugged into a computer; we are plugged into our peripheral sensors. The sensors are in the real world, but we are not.
If the brain is at the center of our experience, then memories of these experiences must also be intrinsic to the brain.
What is memory? A History of Theories
There is a dance between plasticity and the stability that supports memory, and every network must learn to perform it elegantly.
Memory is an integral part of nervous system functioning. It is the dynamic relationship between the synaptic malleability conferred by plasticity and the synaptic stability conferred by memory that makes the nervous system so effective at producing adaptive behaviour. The former allows the incorporation of new relationships and information into our carbon software, while the latter allows us to use the past in service of the present and future. Memory is pivotal in defining who we are and how we react to situations. Even when these reactions seem instantaneous, our intuitions and personalities are rooted in past experiences. These past life events interact with our genetics to influence the strength of synapses in the circuits, regions, and networks that make us who we are.
Theories regarding the nature of memories have existed for a very long time. Many of these theories shared a theme: the idea that memory is stored as a long-lasting change in the same substance that produces perception and thought. As thoroughly discussed in the previous section, sensory details are analyzed by the brain to produce an internal representation of the world, and it is in that internal world that our conscious experience resides. These internal representations are associated with a unique pattern of neuronal activity. The general idea is as follows: when a memory is remembered, for a split second, the brain-wide neuronal activity pattern is identical to what it was during the original experience being remembered. By internally retrieving a memory, you are choosing to ignore your external sensory information for a moment and instead recreate the brain activity that was present in the past, going back in time to that internal representation, that simulation of the world, and reliving it. If we live in a simulation generated by our brain’s response to sensory stimuli, then memory can be thought of as briefly revisiting a simulation that relies less on current external input and is instead an internally generated retrieval of a simulation previously built from external sensory information.
This idea goes back to the times of Plato and Aristotle but was most thoroughly characterized in 1904 by Dr. Richard Semon. In his theory, he introduced the term engram to describe the physical substrate of a memory in the brain. The basic idea was that since a group of cells is active at any single time point in the human brain, and since these cells represent the experience the person is having in an episodic, multi-sensory manner, these same cells must later also represent the memory. In other words, the reactivation of the same cell population at future time points represents the experience during which that cell pattern was first encoded. Semon deduced that if this is the case, these cells must undergo latent, offline physical and molecular changes in order to become more connected to each other as a network. This leaves behind the engram, which we now know to be a network of neural units with strengthened synapses amongst each other. At the time, Semon was unaware of the processes of LTP and LTD, which would not be described until decades later. He simply stated that there would be “primarily latent modification in the irritable substance produced by a stimulus” and that the cells would “form a connected simultaneous complex of excitations which, as such, act engraphically, that is to say, leaves behind it a connected, and to that extent, unified engram-complex.” This makes his theoretical deductions all the more impressive. When asked about the molecular underpinnings of his theory, he stated that “To follow this into the molecular field seems to me…a hopeless undertaking at the present stage of our knowledge and for my part, I renounce the task”.
Due to these technological limitations, Semon’s engram theory went largely unnoticed for a very long time, until Dr. Donald Hebb proposed his postulate on activity-dependent synaptic strengthening, the idea underlying what we now call LTP and LTD. His postulate explained how the “neural substrate holding a memory” could undergo “latent modifications” after initial activation, so as to stabilize the synapses connecting the cells of a memory engram. As described in the previous section, when the activity of cell A precedes the activity of cell B in a highly temporally correlated manner, a molecular cascade is triggered that results in more receptors and eventually more synapses between those neurons. Hebb did not just think about two-neuron systems (i.e., neuron A and neuron B) but extended his paradigm by contemplating its implications at the level of an entire network. He reasoned that synaptic modifications should facilitate the formation of entire cell assemblies: networks of cells that are commonly co-activated. In his words, these assemblies are “simultaneously active reciprocally connected cells.” Cell assemblies have useful properties: you only need to re-activate a proportion of the cells to activate the whole assembly, as the ones you directly activate will stimulate the rest via potentiated synapses. This property would be useful for memories, as one relatively small or discrete retrieval cue could trigger the retrieval of the entire memory representation. Similarly, the reciprocal, distributed nature of these assemblies means that the destruction or silencing of some of these cells would not cause catastrophic destruction of the entire memory representation. When Semon proposed his engram theory, he also stated that engrams must have these properties if they were to support memory effectively.
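The pattern-completion property described above can be made concrete with a toy simulation. The sketch below is purely illustrative (the network size, assembly size, and firing threshold are arbitrary choices, not values from any experiment): a simple Hebbian rule strengthens the synapse between every pair of co-active cells, and reactivating only a fraction of the assembly is then enough to recruit the rest.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100                                              # neurons in the toy network
assembly = rng.choice(N, size=20, replace=False)     # cells co-active during the "experience"

# Activity pattern during the experience: 1 = active, 0 = silent
pattern = np.zeros(N)
pattern[assembly] = 1.0

# Hebbian learning: strengthen the synapse between every pair of co-active cells
# ("cells that fire together wire together"); no self-connections.
W = np.outer(pattern, pattern)
np.fill_diagonal(W, 0.0)

# Retrieval cue: reactivate only 30% of the assembly (a partial cue)
cue = np.zeros(N)
cue[assembly[:6]] = 1.0

# One synchronous update step: each cell fires if its summed input crosses a threshold
theta = 3.0
recalled = (W @ cue > theta).astype(float)

overlap = recalled[assembly].mean()                                  # fraction of the assembly recruited
spurious = recalled[np.setdiff1d(np.arange(N), assembly)].mean()     # cells recruited by mistake
print(f"fraction of assembly recruited by partial cue: {overlap:.2f}")
print(f"fraction of non-assembly cells recruited:      {spurious:.2f}")
```

Running this, the partial cue recruits the entire assembly and none of the other cells, which is exactly the pattern-completion behaviour Hebb attributed to reciprocally connected, potentiated assemblies.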
This touches on an extremely important concept in memory research: population codes. Real neuronal networks are much more intricately connected than two-neuron systems, resulting in a complicated, web-like structure of interactions. In such systems, it is extremely important to approach memory from a population perspective rather than from a single-cell perspective. Different experiences and states of being are represented by a unique activity pattern across the whole population, termed a population code. Importantly, two different population codes might have some individual cells in common, even if they represent two widely different experiences.
Looking back through a historical lens, it is easy to see many similarities between Hebb’s cell assembly theory and Semon’s coining of the term “engram.” Today, cell ensemble is used generally to refer to any co-active population of neurons, whereas engram specifically refers to the population of cells that represents a particular memory. Importantly, an engram is not the memory itself but rather the physical substrate of the memory. Experiments after Hebb’s postulate elucidated the molecular and physiological details of synaptic potentiation and found these molecular and physical changes to be critical and necessary for memory formation. Others found that enhancing them, e.g. by enhancing NMDA-receptor-mediated LTP, can result in strengthened memory representations.
Modern Data and Perspective
While this evidence was important in asserting the role of LTP and LTD in memory, it remained circumstantial; it was not enough to reject or strongly support Semon’s and Hebb’s theories with appropriate scientific rigour. Recently, advancements in molecular techniques have greatly increased the precision with which scientists can manipulate the brain, allowing manipulations at cellular resolution like never before. A new class of techniques, termed “IEG-dependent Tet-off/Cre-Lox systems,” allowed scientists to “tag” only the cells that were active during the completion of a certain task within a certain time window. In other words, if a mouse is completing a maze, the experimenter can label only the cells active during maze completion, and no other cells, by making them express GFP and thus glow.
With this technology, cells that were active during an experience can be “captured” by molecularly tagging them. Using it, recent experiments have provided extremely compelling evidence that engram cells are both the cells active during an experience and the cells responsible for representing the memory at future time points.
First, observational studies supported this claim. Specifically, experiments tagged cells active in the lateral amygdala of animals undergoing fear conditioning training, in which a context was paired with an aversive stimulus such as a footshock. This brain structure is responsible for learning to predict stressful and aversive relationships and is well known to be necessary for appropriate fear learning. Using this tagging approach, cells active during the learning phase of the experiment were tagged with GFP and thus glowed green under the microscope. Days later, experimenters allowed the mouse to explore the same context in which it had been shocked and labelled the cells active during this second visit with a different colour. During this test session, the animal displayed behavioural signs that it remembered being shocked in this location (freezing). After sacrificing the animals and inspecting their brains, the experimenters found a very large degree of overlap between cells active during initial learning and cells active during retrieval of the memory. In control mice, which visited a different context on the second day, there was little overlap. This suggests that the population of cells active during the first visit to context A was active again in the same context on a different day, correlating with memory retrieval (the engram), whereas a different population of cells was active when the animal was placed in a different context B (control).
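The kind of comparison these studies rely on can be illustrated with a toy calculation (the numbers below are invented for illustration and do not come from any specific paper): count the cells tagged during learning, count the cells tagged during retrieval, and compare the observed overlap with the overlap expected by chance if the two populations were sampled independently.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cells = 1000                       # cells imaged in the region (arbitrary)

# Hypothetical population tagged during learning (indices of active cells)
tagged_learning = set(rng.choice(n_cells, 300, replace=False).tolist())

# "Engram-like" retrieval: mostly the same cells come back online, plus a few new ones
reused = rng.choice(sorted(tagged_learning), 240, replace=False).tolist()
new = rng.choice(sorted(set(range(n_cells)) - tagged_learning), 60, replace=False).tolist()
tagged_retrieval = set(reused) | set(new)

observed = len(tagged_learning & tagged_retrieval)

# If the two populations were independent, the expected overlap would be
# (fraction active at learning) * (fraction active at retrieval) * n_cells
chance = (len(tagged_learning) / n_cells) * (len(tagged_retrieval) / n_cells) * n_cells

print(f"observed overlap: {observed} cells")
print(f"chance overlap:   {chance:.0f} cells")
```

In the real experiments, an overlap well above this chance level in the trained context, but not in a different context, is what licenses the engram interpretation.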
While suggestive, observational data were not sufficient to completely convince the field. Later experiments manipulated “engram” cells and concluded that these cells are both sufficient and necessary for the representation of a memory. One experiment used a very similar methodology to tag the cells of the amygdala during fear conditioning training. However, instead of simply making these cells glow a certain colour under the microscope, the experimenters made them express a toxin originally derived from Corynebacterium diphtheriae, a pathogenic bacterium. This functionally killed the cells active during training while causing no damage to neighbouring cells within the network. When placed, a day later, in the same context where training originally took place, the animals showed no behavioural sign of memory retrieval. Based on their freezing levels, it was as if they had absolutely no memory of ever being shocked in that context. However, they had no problem remembering being shocked in another context and were also able to re-learn the shock association within the original context once they were shocked again. This indicates that killing that small subpopulation within the amygdala did not have adverse general circuit effects causing a broad loss of the ability to learn fear relationships. Instead, it caused the very specific erasure of one memory and no other impairments, providing strong evidence that engram cells are necessary for the expression of a memory in service of the present.
On the other side of the same coin, gain-of-function manipulations found engram activation to be sufficient for memory retrieval. In a similar experiment, scientists tagged cells active during fear conditioning training in context A. This tagging was special: it caused the engram cells to express specialized ion channels that the experimenter could trigger to open, causing the engram cells to fire on demand (essentially at the press of a button). The experimenters confirmed that the animal showed behavioural indications of having the memory, as it froze in context A. In a different context B, where no fear learning had occurred, the animal did not freeze under baseline conditions. However, when the experimenters pressed their button and activated the engram cells representing being shocked in context A, the mouse suddenly froze as if it were in context A. The experimenters concluded that activation of engram cells is sufficient to cause retrieval of the memory they represent. In science, showing that a variable is both necessary and sufficient for an effect is the empirical gold standard for suggesting causality, as opposed to a pure correlation, between the variable and the effect.
Engram Allocation
Within a network of neurons, only a subset tends to be active at any one time. Specifically, of all the principal neurons in a network, roughly 30% tend to be sufficiently active during an experience to be incorporated into the memory engram of that experience. This is the case even though far more than 30% of the neurons in a circuit could theoretically be eligible to participate in a given memory trace. This is especially true of brain regions responsible for flexible learning and less true of primary somatosensory and motor areas.
This may be confusing at first. How can many neurons be eligible to participate in the representation of an experience? Surely, if the experience is unique, and every neuron in the brain has a predetermined meaning, then the neural codes for different experiences must also be predetermined. While this may be intuitive, it is not the case. When someone has a completely novel experience, there is a competitive race amongst neurons in a brain region to participate in the representation of that experience. Some win and some lose, so at the population level, different neurons come to represent different experiences. Once a neuron becomes active during a novel event, it acquires, from that point on, the ability to represent that experience. In the future, the activation of this neuron in tandem with the rest of the sub-population that was active during the experience will trigger memory retrieval.
Why is this the case, and why does it fly in the face of intuition? It is because the assumption that neurons have predetermined meaning is false. Empirical data suggest that neurons acquire meaning through experience and come to represent a part of that experience. A given neuron might become involved in the representations of multiple different experiences, but the total sub-population will be different for each experience.
This is not the case across the whole brain. In higher-order structures related to learning, such as the hippocampus, lateral amygdala, and prefrontal cortex, there are many eligible neurons that can participate in a memory engram. However, in more primary somatosensory and motor areas, there is much less variation. Neurons that represent your finger tend to be more or less consistent across all representations of your finger.
How and why do neurons in higher-order regions have this ability to acquire new meaning at the time of an experience? It is because these neurons have a multitude of silent synapses with different primary sensory and primary motor neurons distributed across the cortex. For example, a neuron in the lateral amygdala, call it neuron A, could have synapses connecting it to neurons in both the auditory cortex and the visual cortex. Both synapses are silent; that is, they do not yet produce a meaningful response in neuron A when the presynaptic cells fire. Now, if an animal undergoes a fear conditioning paradigm linking a sound to a shock, a race starts amongst neurons in the lateral amygdala. The first 30% to become active win, after which network homeostasis prevents any more neurons from activating, deeming the remaining 70% losers of the race. These 30% of neurons now represent the fear engram and are termed fear engram cells. If neuron A was in the fear engram, that means it was active during the tone-shock pairing, and at the same time, tone-representing cells in the auditory cortex were also active. Through the mechanisms of LTP, these synapses become unsilenced. The next time the animal hears the sound, because of this synaptic strengthening, tone cells in the auditory cortex will activate neuron A, which will activate other cells in the fear memory engram, which will produce a freezing response. Theoretically, if this fear conditioning paradigm had used light instead of sound, it is possible that neuron A would have helped represent a light-shock pairing instead of a tone-shock pairing. Neuron A only acquires its meaning at encoding and continues to represent that memory when activated with the rest of the engram. Neuron A can also become part of different engrams in the future, so long as the rest of the representational population is different.
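A toy version of this allocation-and-unsilencing story is sketched below. It is purely illustrative: the 30% figure follows the text, while the network sizes and the simple winner-take-all rule are arbitrary modelling choices. The most excitable cells win the race, only their synapses from the co-active tone inputs are strengthened, and the tone alone later reactivates exactly that winning subpopulation.

```python
import numpy as np

rng = np.random.default_rng(3)

n_amygdala = 100                  # toy lateral-amygdala population
n_tone_inputs = 20                # toy auditory "tone" cells, all active during conditioning

# Silent synapses: connections exist anatomically but carry no functional weight yet
W = np.zeros((n_amygdala, n_tone_inputs))

# The race: during the tone-shock pairing, the most excitable 30% of cells win
excitability = rng.normal(size=n_amygdala)
winners = np.argsort(excitability)[-30:]          # the engram cells

# Unsilencing: synapses from the co-active tone inputs onto winning cells are strengthened
W[winners, :] = 1.0

# Later, the tone alone is presented: tone inputs are active, nothing else is
tone_activity = np.ones(n_tone_inputs)
drive = W @ tone_activity
reactivated = np.where(drive > 0)[0]

print(f"engram cells (first 10):        {np.sort(winners)[:10]} ...")
print(f"cells reactivated by the tone:  {np.sort(reactivated)[:10]} ...")
print(f"engram == reactivated set? ->   {set(winners.tolist()) == set(reactivated.tolist())}")
```

The point of the sketch is simply that the tone comes to drive neuron A and its fellow winners not because those cells were pre-wired to mean “tone-shock,” but because they happened to be active, and therefore strengthened their inputs, during the pairing.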
In some places, meaning is predetermined. For example, cells in your visual cortex that receive input (via several relays) from red-sensing cones have an anatomically predetermined meaning: to represent the colour red. In all the more interesting, higher-order brain regions, however, meaning is practically never predetermined.
It is thus empirically supported that higher-order neurons can change their synaptic weights and “learn” to represent many different things. If a neuron wins the race for activation during the initial experience, it will help represent the memory of that experience. What is this race exactly, though? What determines which of the eligible neurons end up among the 30% active at any one time point? The answer is intrinsic excitability.
Even amongst a population of neurons all at resting membrane potential, there exists variation in intrinsic excitability. Intrinsic excitability is the proclivity a neuron has to fire in response to a given excitatory input. In other words, if all neurons were to receive the same excitatory input, some will fire an action potential while others will not. This is because of variation in intracellular proteins and transcripts, which result in different metabolic states. Neurons with the greatest intrinsic excitability at any one time point will be highly biased to become a part of a memory engram representing that specific time point. A landmark experiment found that neurons with high CREB levels have higher intrinsic excitability. Afterwards, by artificially creating a subpopulation of neurons that over-express CREB, they found that these neurons were highly biased to become a part of a fear memory engram relative to chance levels. Conversely, experiments inhibiting CREB in a subpopulation of cells found that these neurons were excluded from the memory engram at a far greater rate than would be predicted by chance.
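This bias can be illustrated with a small probabilistic sketch (illustrative numbers only; “CREB-high” here simply means a cell given a larger excitability value in the toy model). When engram membership goes to the most excitable cells, a subpopulation with boosted excitability is recruited far above its chance rate, mirroring the CREB over-expression result described above.

```python
import numpy as np

rng = np.random.default_rng(4)

n_cells = 1000
engram_size = 300                       # ~30% of the network joins any one engram

# A random 10% of cells "over-express CREB", which raises their intrinsic excitability
creb_high = rng.choice(n_cells, 100, replace=False)
excitability = rng.normal(0.0, 1.0, n_cells)
excitability[creb_high] += 2.0          # arbitrary boost for the CREB-high cells

# Allocation: the most excitable cells at the moment of learning win the race
engram = np.argsort(excitability)[-engram_size:]

in_engram = np.isin(creb_high, engram).mean()
chance = engram_size / n_cells

print(f"fraction of CREB-high cells recruited into the engram: {in_engram:.2f}")
print(f"chance level (any cell's odds of joining):             {chance:.2f}")
```

Lowering the boost (or making it negative, as in the CREB-inhibition experiments) pushes the recruited fraction below chance instead, which is the other half of the empirical result.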
Meaning is created, not predetermined: An intuitive example
Have you ever looked at your lecture notes after a class and thought to yourself, “it would be tough for somebody else to understand these notes if I just handed them over”? We tend to take shortcuts when writing notes, and despite the logical leaps those shortcuts involve, we are confident that we will understand them in the future because we have the memory of writing them in the first place. A code (in this case, your scribbled-down notes) might be difficult to extract meaning from on its own, but it becomes much more meaningful after you have experienced creating it. In the same way, a neural code is consolidated with meaning through an experience and becomes meaningful only after that experience.
Another example: let’s say Steve is trying to memorize the first seven cranial nerves. He creates an acronym: Ooo, Tim Tom Ate Five Cakes! During the test, if Steve sees this acronym, it will help him retrieve the seven cranial nerves.
However, suppose that two weeks before Steve took HMB200, someone had given him a piece of paper with this acronym written on it. It would have meant nothing to him. It is the exact same code, but without experiencing the encoding process, the code means nothing. Similarly, a neural code might activate and mean nothing; two weeks later, the exact same neural code might activate and carry a very robust meaning. The difference is that some form of encoding happened in between, by which what was for a long time simply a random subset of neurons became an engram population.
You can trace this back in time indefinitely. Even the very letters you are reading only have meaning because, over countless hours across your lifetime, you stared at very similar shapes while someone systematically uttered the sounds those letters represent. Language is different across cultures precisely because meaning is created as humans experience life. Very little is predetermined.
Changing Intrinsic Excitability
The intrinsic excitability of a network is not static; it is always changing. At any one time point, a neuronal network has a very specific pattern of excitability, which will be different later on. There are two main factors that cause intrinsic excitability to change: 1) time and 2) past experience.
The more hours that pass, the more the excitability pattern diverges from what it was. Some neurons become less excitable, while others become more excitable. While the exact mechanism behind this time-dependent drift remains elusive, the fact that it exists is apparent. As a result, engrams for two experiences that occur very close in time overlap substantially: many neurons that represent experience 1 will also represent experience 2. Experiences close in time thus have similar memory traces. Theorists propose that this could facilitate the emergence of a mental timeline within our memories; since memories formed close in time have overlapping representations, that overlap can serve as a code for the close temporal proximity of those events.
After a neuron has been activated, it remains in a slightly more excitable state for multiple hours. Remember from the previous section that the opening of NMDA receptors in a neuron triggers increased CREB, which then maintains increased excitability of the cell in parallel with its roles in LTP. During the hours in which CREB is over-expressed in a recently active neuron, that neuron is more excitable. In a compelling experiment, scientists picked a random subpopulation of cells within a network and electrically stimulated them. A couple of minutes later, they trained the mouse on a novel paradigm. They found that the memory trace/engram representing that training episode was largely populated by the cells they had stimulated minutes earlier, even though those cells had been picked at random. Their prior stimulation made them more excitable for a prolonged period, increasing their probability of being incorporated into new memory engrams.
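A minimal sketch of this carry-over effect is shown below, under stated assumptions: an arbitrary decay time constant for the excitability boost, Gaussian baseline excitability, and gradual drift over time. It is not a model of any specific experiment, but it shows the qualitative point: a second experience encoded soon after the first reuses many of the same cells, while one encoded a day later reuses far fewer.

```python
import numpy as np

rng = np.random.default_rng(5)

n_cells = 1000
engram_size = 300
tau_hours = 6.0                         # assumed decay constant for the excitability boost

def allocate(excitability):
    """Engram = the most excitable cells at the moment of encoding."""
    return set(np.argsort(excitability)[-engram_size:].tolist())

def overlap(a, b):
    return len(a & b) / engram_size

baseline = rng.normal(0.0, 1.0, n_cells)
engram_1 = allocate(baseline)

for dt in [1, 6, 24]:                   # hours between experience 1 and experience 2
    boost = np.zeros(n_cells)
    boost[list(engram_1)] = 2.0 * np.exp(-dt / tau_hours)   # decaying carry-over excitability
    drift = rng.normal(0.0, 0.3 * np.sqrt(dt), n_cells)     # baseline excitability drifts with time
    engram_2 = allocate(baseline + drift + boost)
    print(f"{dt:>2} h apart -> engram overlap: {overlap(engram_1, engram_2):.2f} "
          f"(chance ~ {engram_size / n_cells:.2f})")
```

The overlap falls toward chance as the interval grows, which is the signature that theorists link to both temporal coding and the semantic linking described next.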
Why is this beneficial? Theorists suggest that it might be pivotal in the formation of semantic networks. For example, imagine that you are in an HMB lecture learning about LTP but only get through half of the subject matter. A certain population of cells is active during this learning and eventually encodes your memory of LTP as you learnt it. Next week, you come back to class to learn the rest. As you go through the lecture, you build on what you previously learnt, recalling many of the facts taught to you last lecture. During these recollections, you re-activate cells from last week that represented the previous lecture. As a result, these re-activated cells now have greater intrinsic excitability and are more likely to become part of this lecture’s engram. Even though the two lectures were separated by a whole week, the resulting lecture 2 memory engram shares many cells with the lecture 1 engram, merging these concepts within the same interconnected web. Now, when you retrieve your knowledge of LTP, you retrieve information from lecture 1 and lecture 2 in tandem, resulting in a more holistic knowledge base and improving your ability to answer questions on the topic.