The Modal Model of Working Memory

Working memory is quite a difficult concept to understand, largely because it has been classified as a branch of short-term memory (STM), and researchers still disagree about whether the two should be separated. Working memory is concerned with immediate processing, for example holding a phone number in your head whilst trying to find the phone. Arguably, this rote repetition can be considered a form of short-term memory learning. However, for the sake of the model we are about to discuss, I will consider working memory as separate from short-term memory.

Atkinson & Shiffrin

The first major working memory model was proposed by Atkinson & Shiffrin in 1968. They argued that information enters our memory from the environment and is then processed by two sensory memory systems: iconic and echoic. Unlike later models, this model states that any form of rehearsal is sufficient for learning; the only thing that determines whether information enters long-term storage is the length of time it spends in short-term storage. Also unlike other models, Atkinson & Shiffrin held that the short-term memory store serves as the working memory store. The implications of this type of model are that long-term memory (LTM) is entirely dependent on short-term memory, and that levels of processing are irrelevant.

Major Criticisms

Today, Atkinson and Shiffrin's model is not accepted, because a myriad of studies have shown that major features of the model are implausible. Four major sources of criticism are neurological evidence that LTM can be sustained even with damage to STM, serial position effects, levels of processing, and Baddeley and Hitch's 1974 experiment.

Shallice and Warrington (1969) studied patient KF, whose STM was severely impaired but whose LTM remained intact. KF had suffered damage to the left parieto-occipital region of the brain. He showed a very poor digit span (fewer than two items) but normal performance on an LTM task. KF's performance showed that an intact STM is not necessary for a normally functioning LTM. Of course, an intact short-term memory is still important for new information to enter long-term memory. It does mean, however, that information already stored in long-term memory is not affected by damage to short-term memory stores.

Two studies, carried out by Tzeng (1973) and Baddeley and Hitch (1977) respectively, challenged the modal model by testing the serial position effect: the finding that words at the beginning or end of a list are easier to learn than words in the middle. Tzeng (1973) conducted a free-recall test in which participants were given a list of words, with an interpolated task introduced to disrupt recall. Even though the modal model predicts that interpolating a task after every word should remove the words from STM, Tzeng still observed both primacy and recency effects. Baddeley and Hitch studied rugby players' recall of the names of players they had previously played against and found that the more recent the game, the more names were recalled. They were thus able to suggest that the recency effect is unlikely to be due to limited short-term storage capacity.

Another experiment, carried out by Baddeley and Hitch in 1974, suggested that working memory and short-term memory are in fact separate entities. Participants carried out dual tasks: digit span and grammatical reasoning. There was a significant increase in reasoning time, but no impairment in accuracy. The results suggest that STM and WM serve separate roles.

Lastly, and most obviously, learning depends on more than just the amount of time material spends in short-term storage: it depends on how the material is processed (Craik and Lockhart, 1972). Deep, meaningful processing produces far more durable memories than shallow, sensory processing. Craik and Lockhart suggest that there are two major forms of rehearsal: maintenance and elaborative. In a test of this theory, Hyde and Jenkins (1973) gave participants a list of words and asked them to complete one of two tasks that differed in the amount of processing involved: rating each word for pleasantness of meaning, or detecting the occurrence of particular letters. Recall was significantly higher among participants in the elaborative (meaningful) processing condition.

Spotlight Study: Delayed Gratification


In the classic late-1960s experiment, a researcher placed a marshmallow in front of a child and told them that if they could resist it for 15 minutes, they would get a second marshmallow. The large majority of the children, whose average age was four, could not wait the 15 minutes. Follow-up studies confirmed these findings and led to further investigation into the role delayed gratification plays in a child's future success. What researchers discovered was a striking correlation between the amount of time a child could delay gratification and their success as an adult. Furthermore, delayed gratification was shown to be a better predictor of later success than intelligence.

A new study published in the October 2012 issue of Cognition has cast new light on this experiment by showing that there may be a rational decision behind some children's inability to wait for that second marshmallow. Doctoral candidate Celeste Kidd of the University of Rochester, the lead author of the study, hypothesised that the rationale is based on how much the child trusts that the second marshmallow will actually arrive.


According to Kidd's hypothesis, delaying gratification is not a rational decision if a child does not trust the researcher. Living in poor socioeconomic conditions fosters mistrust of delayed rewards, and so Kidd believed that recreating an environment of mistrust under experimental conditions would reduce the time a child could delay gratification.

Kidd et al. gave children some poor-quality art supplies and told them that if they could resist using them, a researcher would return with better supplies. In the 'reliable' condition the researcher did return with better-quality art supplies, but in the 'unreliable' condition the researcher returned without them, saying they did not have any. As in the original experiment, the children had an average age of four, and the majority could not wait the full fifteen minutes. The difference was that children in the 'reliable' condition waited an average of 12 minutes, whereas those in the 'unreliable' condition waited an average of only 3 minutes.


These findings suggest that how long a child is able to wait is strongly shaped by the reliability of their environment, which complicates the original association between waiting time and later measures of mental health, competence and success.

An article on delayed gratification in the March/April 2013 edition of Scientific American Mind mentioned an imaging study of the children from the original 1960s experiment. This study found significant differences in the activity of key brain areas between those who could and those who could not resist the temptation of the marshmallow. This counters the idea that self-control alone determines resistance to temptation: socioeconomic status, quality of parenting and environmental factors also play a crucial role.

Citation:

Makin, Simon. "Delayed Gratification: A Marshmallow in the Hand." Scientific American Mind March/April 2013: 8. Print.

Heuristics: Representativeness

What are Heuristics?

People rely on heuristics because they facilitate the task of assessing probabilities and predicting values; they allow us to make decisions quickly and instinctively. Although heuristics, like schemas, are often inaccurate, people look for evidence that the heuristic or schema is true and ignore failures of judgment (Tversky and Kahneman, 1974). Heuristic errors are known as systematic errors, and they occur because the heuristic cannot cope with the complexity of the task; heuristics simply lack validity.

Representativeness

Representativeness is when the probability of B is evaluated by how much it resembles A, taking for granted the degree to which A and B are actually related (Tversky and Kahneman, 1974). The representativeness heuristic is usually quite accurate, because if A resembles B there is a reasonable likelihood that they are somehow related. Unfortunately, similarity can be misleading, as it is influenced by factors that need to be taken into consideration when judging probability. Factors that influence such judgments include prior probability outcomes, sample size and chance.

– Insensitivity to Prior Probability Outcomes

A major influence on probability is base-rate frequency. For example, even though Steve has the characteristics of a librarian rather than a farmer, the fact that there are many more farmers than librarians in his population needs to be taken into account when assessing the likelihood of him having one occupation over the other. If there are many farmers in Steve's area because of its rich soil, the base-rate frequency suggests that Steve is more likely to be a farmer than a librarian.
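To make the base-rate point concrete, here is a minimal sketch of Bayes' rule in Python; the 20:1 farmer-to-librarian ratio and the 4:1 likelihood in favour of 'librarian' are invented purely for illustration.

```python
# Illustrative only: the base rate and likelihoods below are invented for this sketch.
def posterior(prior_a, prior_b, like_a, like_b):
    """Bayes' rule for two mutually exclusive hypotheses A and B."""
    unnorm_a = prior_a * like_a
    unnorm_b = prior_b * like_b
    return unnorm_a / (unnorm_a + unnorm_b)

# Suppose Steve's description seems four times more typical of a librarian,
# but farmers outnumber librarians 20 to 1 in his population.
p_librarian = posterior(prior_a=1 / 21, prior_b=20 / 21, like_a=0.8, like_b=0.2)
print(f"P(librarian | description) = {p_librarian:.2f}")  # ~0.17
```

Even with a description that favours 'librarian' four to one, the base rate leaves Steve far more likely to be a farmer.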

In 1973, Kahneman and Tversky conducted an experiment showing how often people overlook base-rate frequency when assessing probability. Participants were shown short personality descriptions of several individuals said to be sampled from a group of 100 lawyers and engineers, and their task was to judge which individuals were likely to be lawyers and which engineers. In condition A, participants were told the group contained 70 engineers and 30 lawyers; in condition B, 30 engineers and 70 lawyers. The two conditions produced virtually the same probability judgements, despite the clearly stated difference in base rates. Participants only used the base rates when no personality descriptions were given.

Goodie and Fantino (1996) also studied base-rate neglect. Participants were asked to determine the probability that a taxi seen by a witness was blue or green. Even though participants were given the base rates of taxi colours in the city, they still judged the probability by the reliability of the witness.

– Insensitivity to Sample Size

Another major influence on probability judgments is sample size. The similarity of a sample statistic to a population parameter does not depend on sample size; therefore, if probabilities are assessed by representativeness, the judged probability is independent of sample size. Tversky and Kahneman (1974) conducted an experiment to show this insensitivity to sample size. Participants were given the following information:

–       There are two hospitals in a town, one small and one large

–       About 45 babies are born per day at the large hospital

–       About 15 babies are born per day at the small one

–       50% of the babies born are boys, but this figure differs slightly every day

Participants were then asked which hospital is more likely to record a day on which more than 60% of the babies born are boys. Most participants answered that the two hospitals are equally likely. However, sampling theory says that the small hospital is more likely to record such days, because the larger the sample, the less likely it is to stray far from the mean.
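A quick simulation makes the hospital result easy to check; the only assumption is that each birth is an independent 50/50 event.

```python
import random

def prob_day_over_60pct_boys(births_per_day, n_days=20_000):
    """Monte Carlo estimate of how often more than 60% of a day's births
    are boys, assuming each birth is an independent 50/50 event."""
    hits = 0
    for _ in range(n_days):
        boys = sum(random.random() < 0.5 for _ in range(births_per_day))
        if boys > 0.6 * births_per_day:
            hits += 1
    return hits / n_days

print("Small hospital (15/day):", prob_day_over_60pct_boys(15))  # ~0.15
print("Large hospital (45/day):", prob_day_over_60pct_boys(45))  # ~0.07
```

The smaller sample strays from the 50% mean far more often: roughly 15% of days, versus about 7% for the large hospital.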


Unfortunately, the general public are not the only ones to fall victim to insensitivity to sample size. In 1971, Tversky and Kahneman surveyed experienced research psychologists and found that the majority stood by the validity of results from small samples. The researchers simply put too much faith in findings from small samples, underestimating how easily such samples can misrepresent the population. It is likely that the benefits of a large sample size are drilled into psychology students from day one precisely to avoid errors like this.

– Misconceptions of Chance

People expect that a randomly generated sequence of events will represent the essential characteristics of the generating process even when the sequence is short. In other words, people think that a sequence like H-T-H-T-T-H is more likely than H-H-H-H-H-H, even though they are equally likely. This is because every T or H should be assessed as an individual probability event: on trial one you have a 50% chance of getting a T or an H, and on trial two the result of the first trial has no impact, so you again have a 50% chance of either outcome. Probability matching is another manifestation of this misconception of chance. Andrade and May (2004) describe a scenario based on real-life misconceptions of chance.

Participants are given a jar of 100 balls and told that 80% are red and 20% are white. When asked to predict the colour of each ball before it is drawn, the most commonly observed strategy is to guess in a pattern that imitates the 80/20 split. In reality, the most efficient strategy is to say red on every draw, because the probability, as stated above, needs to be assessed for each individual draw rather than for the task as a whole. The implications for gambling are huge, so it is not surprising that the gambler's fallacy is closely related to probability matching: people believe in a 'law of averages', according to which an event that has occurred less often than probability suggests becomes more likely to occur in the future.
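A short simulation shows why probability matching loses to always guessing the majority colour; the 80/20 jar from the example above is assumed.

```python
import random

random.seed(0)
draws = [random.random() < 0.8 for _ in range(10_000)]  # True = red (80% of the jar)

# Strategy 1: probability matching -- guess "red" on roughly 80% of draws at random.
matching_correct = sum(d == (random.random() < 0.8) for d in draws)

# Strategy 2: always guess the majority colour ("red").
always_red_correct = sum(draws)

print("Probability matching:", matching_correct / len(draws))    # ~0.68
print("Always guess red:    ", always_red_correct / len(draws))  # ~0.80
```

Matching the 80/20 pattern is right only about 68% of the time, whereas always saying red is right about 80% of the time.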

– Insensitivity to Predictability

Another issue is insensitivity to predictability, which occurs when people ignore the reliability of a description and instead pay attention to unrelated factors. For example, a person might give a particular review more weight simply because the reviewer shares their name, or ignore negative reviews and attend only to positive ones because they confirm an existing belief. In either case, the reliability of the evidence is disregarded.


Tversky and Kahneman conducted an experiment in 1973 in which participants were given several descriptions of the performance of a student teacher during a particular lesson. Some participants were asked to evaluate the quality of the lesson described as a percentile score, and others were asked to predict the standing of the student teacher five years after the practice lesson. The judgments of the second group matched the evaluations of the first. Even though participants knew how poorly a single lesson predicts performance five years into the future, they confidently judged the student teacher's future standing to be identical to their present performance. Sadly, high confidence in the face of poor predictability is common and is known as the illusion of validity. The confidence people display in their predictions depends chiefly on representativeness, with other factors largely ignored, and the illusion persists even when a person is aware of the limited accuracy of the prediction (ibid).

– Misconceptions of Regression

People simply do not expect regression to occur, even in contexts where it is common (Tversky and Kahneman, 1974). A good example of regression towards the mean is height: two above-average-height parents will tend to have a child who is taller than average but closer to the mean than they are. Despite this, people tend to dismiss regression because it is inconsistent with their beliefs. This failure leads, for example, to overestimating the effectiveness of punishment and underestimating the effectiveness of reward.
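Regression towards the mean is easy to see in a toy simulation; the mean, spread and parent-child correlation below are illustrative numbers, not real anthropometric data.

```python
import random

random.seed(1)
MEAN, SD, R = 170, 7, 0.5   # illustrative numbers, not real anthropometric data

pairs = []
for _ in range(50_000):
    parent = random.gauss(MEAN, SD)
    # the child shares a correlation of R with the parent; the rest is independent noise
    child = MEAN + R * (parent - MEAN) + random.gauss(0, SD * (1 - R ** 2) ** 0.5)
    pairs.append((parent, child))

tall = [(p, c) for p, c in pairs if p > MEAN + SD]   # clearly tall parents
avg_parent = sum(p for p, _ in tall) / len(tall)
avg_child = sum(c for _, c in tall) / len(tall)
print(f"tall parents average {avg_parent:.1f} cm, their children {avg_child:.1f} cm")
```

The children of clearly tall parents are still taller than average, but on average they sit noticeably closer to the mean than their parents do.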

– Implicature

Implicature refers to what a sentence suggests rather than what is literally said (Blackburn, 1996). For example, the sentence "I showered and went to bed" implies that first I showered and then I went to bed; taken literally, however, it could mean that I went to bed and then showered in the morning. Both readings are possible, but given the context it would be strange for me not to mean that I showered before going to sleep. Sometimes a qualification is added, which contributes new information to the context. Even if the sentence itself is not altered, a qualification clarifies the implication, for example: "I showered and went to bed, in that exact order."

– The Conjunction Fallacy

The conjunction fallacy, first outlined by Tversky and Kahneman in 1983, refers to the tendency to believe that two events are more likely to occur together than independently. The example provided by Tversky and Kahneman, the 'Linda the bank teller' problem, is as follows:

“Linda is 31 years old, single, outspoken, and very bright. She studied philosophy at University. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-war demonstrations.

Which is more likely?

A) Linda is a Bank Teller

B) Linda is a Bank Teller and is active in the feminist movement.”

The results showed that people chose B more often than A because they see the description above as indicative of Linda's personality, and B seems like a better fit for her as a person. In truth, B cannot be more likely than A: the probability of a conjunction (bank teller and feminist) can never exceed the probability of one of its constituents (bank teller) on its own. Regardless, the general public as well as statistical experts still rely on the representativeness heuristic, ignoring the probabilities at play.
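The arithmetic behind the fallacy can be checked directly: in any population, however the traits are distributed, the conjunction can never be more frequent than either trait alone. The proportions below are arbitrary.

```python
import random

random.seed(2)
# Hypothetical population: the proportions of "bank teller" and "feminist" are arbitrary.
people = [{"teller": random.random() < 0.05, "feminist": random.random() < 0.30}
          for _ in range(100_000)]

p_teller = sum(p["teller"] for p in people) / len(people)
p_both = sum(p["teller"] and p["feminist"] for p in people) / len(people)

print(f"P(teller) = {p_teller:.3f}, P(teller and feminist) = {p_both:.3f}")
assert p_both <= p_teller   # the conjunction can never be the more likely option
```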

Overconfidence

People tend to be overly confident in their estimates even when the irrationality of their thinking is pointed out to them (Fischhoff et al., 1977). Baron (1994) found evidence that one reason for our inappropriately high confidence is our tendency not to search for reasons why we might be wrong. As ridiculous as this may seem, studies consistently confirm that people ignore new information in order to hold on to their original belief. Weinstein (1989) reported a study in which racetrack handicappers were gradually given more and more information about the outcomes of races; despite becoming better informed, they held on to their original beliefs with ever greater confidence. By contrast, DeBondt and Thaler (1986) propose that when new information arrives, investors revise their beliefs by overweighting the new information and underweighting the earlier information.

 

Availability Heuristics

Availability

People tend to judge the frequency of an occurrence or the probability of an event by the ease with which instances come to mind. Availability is usually an effective heuristic, because if you can easily remember an event, there is a good chance you recently experienced it or were exposed to it. However, just as with similarity and probability, availability is affected by other factors as well, namely biases.

Biases Due to the Retrievability of Instances

When the size of a class is judged by the availability of its instances, a person is displaying a bias due to the retrievability of instances. A class whose instances are difficult to remember is likely to be judged as having fewer members, and salience and recency largely determine how easily instances come to mind. In 1973, Galbraith and Underwood showed that because it is easier to think of abstract words, which are used in many contexts, than concrete words, participants judged the frequency of occurrence of abstract words to be higher than that of concrete words. Even though the concrete words are more common, abstract words are not contextually constrained, so they appear more salient.


Another example comes from Lichtenstein et al. (1978), who asked participants to rate the likelihood of particular causes of death. Participants believed that accidents caused as many deaths as disease, and that murder was more common than suicide. In reality, diseases cause about sixteen times as many deaths as accidents, and suicides are almost twice as common as murders. The availability heuristic suggests that murders and accidents seem common because media coverage of them is far greater. As media stories are far more accessible to us than disease or suicide statistics, the information spread by the media becomes far more salient, and this salience translates into a false perception of frequency.

Biases of Imaginability

The imaginability bias arises when one has to assess the likelihood of an event not by retrieving instances from memory but by constructing them according to a particular rule or task. For example, consider a group about to embark on an expedition together. As part of the preparation they must imagine all the possible difficulties they might encounter. The more easily such difficulties come to mind, the more likely they seem, regardless of their actual probability. Again, discussing instances of difficulty heightens salience to the point of interference.

Illusory Correlation

Chapman and Chapman (1967) described illusory correlation as the overestimation of the frequency with which two events co-occur. The same year, Chapman and Chapman carried out a study investigating the strength of illusory correlation. Participants were presented with information about several hypothetical mental patients; each patient came with a clinical diagnosis and a drawing of a person made by that patient (the Draw-a-Person test). The results showed that participants overestimated the frequency of co-occurrence of 'natural associates' such as suspiciousness and peculiar eyes, a finding remarkably similar to clinical reports of the same task. Even when presented with contradictory data, the illusory correlation remained resistant, to the point of preventing participants from detecting relationships that actually were present.

Recognition Heuristic

As the name suggests, with the recognition heuristic a salience bias is established by how well we recognise an object, name, etc. Goldstein and Gigerenzer (2002) asked German and American students about four cities, specifically which city they thought was larger: San Antonio or San Diego? And Hamburg or Cologne? Contrary to expectations, the results showed that American students were more accurate for the German cities, and German students were more accurate for the American cities. Unlike in the other studies, the heuristic actually benefited the students when they knew little about the other country: San Diego is larger and also features more often in films and the news, and the same is true for Hamburg. Guessing that San Diego and Hamburg are the larger cities therefore makes sense for foreign students, because they have only a very limited amount of information available to influence salience. For the natives, answering the question becomes more difficult because both cities are salient, and factors such as familiarity or personal experience are likely to interfere with their answer.

Biases Due to the Effectiveness of a Search Set

Lastly, Tversky and Kahneman (1974) describe the bias due to the effectiveness of a search set, which is easiest to explain through an example. If participants are asked to list words beginning with 'T' and words containing 'T', it is easier to think of words starting with the letter than words merely containing it; words beginning with 'T' therefore have greater salience, and, as has been established, greater salience means participants are likely to list far more words beginning with 'T'.

Anchoring and Adjustment

Anchoring and adjustment is a heuristic in which people make estimates by starting from an initial value that is then adjusted, often subconsciously, to produce the final answer. Typically, the adjustments made to the initial value are insufficient. Studies have presented two groups of participants with different starting points, and different starting points consistently yield different estimates (Slovic and Lichtenstein, 1971). Tversky and Kahneman (1974) based an experiment on this effect: participants were asked to estimate various quantities expressed as percentages (such as the percentage of African countries in the UN). A wheel of fortune was spun to provide an initial value; participants first judged whether the true quantity was higher or lower than the spun number and then gave their estimate. The arbitrary initial number significantly influenced participants' final estimates.

Biases of the Evaluation of Conjunctive and Disjunctive Events

Studies of choice among gambles show that people tend to overestimate the probability of conjunctive events and to underestimate the probability of disjunctive events (Cohen and Chesnick, 1972). In relation to anchoring and adjustment, people anchor on the probability of the elementary event and fail to adjust it sufficiently for the compound gamble. For example, given a simple task (drawing a red marble from a bag of 50% red and 50% white marbles), a conjunctive task (drawing a red marble seven times in a row from a bag of 90% red and 10% white) and a disjunctive task (drawing a red marble at least once in seven draws from a bag of 10% red and 90% white), people tend to prefer the conjunctive gamble, even though it offers the worst odds of the three, and to avoid the disjunctive gamble, which offers the best. Dawes (1988) found that even when the inconsistencies in their thinking are pointed out to them, people still stick by their original choice.
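The odds of the three gambles are easy to compute (assuming draws with replacement), which makes the typical preference for the conjunctive bet all the more striking.

```python
# Win probabilities for the three marble gambles described above (draws with replacement).
p_simple = 0.5                    # one red draw from a 50/50 bag
p_conjunctive = 0.9 ** 7          # red on all seven draws from a 90/10 bag
p_disjunctive = 1 - 0.9 ** 7      # at least one red in seven draws from a 10/90 bag

print(f"simple:      {p_simple:.3f}")       # 0.500
print(f"conjunctive: {p_conjunctive:.3f}")  # 0.478
print(f"disjunctive: {p_disjunctive:.3f}")  # 0.522
```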


Kahneman and Tversky (1979) later devised prospect theory to explain loss-averse behaviour. The theory assumes that individuals evaluate outcomes relative to a reference point, and that people are more sensitive to potential losses than to potential gains. As such, people are more willing to accept lower winning odds if it means lowering the possibility of losses. Of course, casinos would have gone bankrupt by now if every person thought this way: distorted judgement, wishful thinking or even sheer desperation can override these tendencies when we make decisions (Edwards, 1968).

Understanding Risk: Gigerenzer and Edwards, 2003

Single-event probabilities are probability statements about a single event; they leave the class of events to which the probability refers open to interpretation, for example, "there is a 50% chance of a pop quiz tomorrow." The reference class could be an area, a time of day, or anything else that alters the meaning of the single-event probability statement. Appropriate framing can avoid this confusion. Framing is the expression of the same information in multiple ways; as with gambling, information can be framed positively versus negatively, as gains versus losses, and by reference class.

Framing and Single Event Probabilities

In 2002, Gigerenzer worked with a psychiatrist and his patients to observe first-hand the importance of reference classes. The psychiatrist prescribed an antidepressant and told patients that they had a 30-50% chance of developing a sexual problem. Although the psychiatrist meant that out of every 10 patients, 3 to 5 would experience sexual dysfunction, the patients understood it to mean that something would go wrong in 30-50% of their own sexual encounters. Gigerenzer argued that such misunderstandings could be reduced or avoided by specifying a reference class or by using frequency statements (3 out of 10 patients rather than 30-50%).


Framing and Conditional Probabilities

Conditional probability refers to the likelihood of A given B; for example, if someone has low mood and anxiety, the likelihood that they have depression is high. The probability of A given B should not be confused with the probability of B given A. Conditional probabilities are very important in disease detection, and Gigerenzer and Edwards (2003) attest that the way probabilities are framed can easily confuse patients. As with the psychiatrist's warnings, giving the information in a clear, alternative format, in this case natural frequencies, can reduce confusion. In 2002, Gigerenzer found that even doctors struggle with reference classes when information is presented as conditional probabilities. In a study, 48 doctors were asked to estimate the probability that a woman with a positive mammogram result actually has breast cancer. Of the doctors who received conditional probabilities, very few gave the correct answer; most of those given natural frequencies did. An example of what the doctors were given is as follows:

CP: The probability that a woman has breast cancer is 0.8%.

NF: Eight out of every 1,000 women have breast cancer.

CP: If a woman has breast cancer, the probability that the mammogram will show a positive result is 90%.

NF: Of every eight women with breast cancer, seven will have a positive mammogram; one will get a false negative.
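Working through the numbers as natural frequencies shows why the correct answer surprises people. The prevalence and sensitivity come from the example above; the roughly 7% false-positive rate is an assumption added here purely so the calculation can be completed.

```python
# Natural-frequency arithmetic for the mammogram example. Prevalence and sensitivity
# come from the passage above; the ~7% false-positive rate is an assumption added
# here purely so the calculation can be completed.
women = 1000
with_cancer = 8                                         # 8 in 1,000 have breast cancer
true_positives = round(0.9 * with_cancer)               # 90% sensitivity -> 7 women
false_positives = round(0.07 * (women - with_cancer))   # assumed false alarms -> 69 women

p_cancer_given_positive = true_positives / (true_positives + false_positives)
print(f"P(cancer | positive mammogram) = {p_cancer_given_positive:.2f}")  # ~0.09
```

In this illustration, fewer than 1 woman in 10 with a positive mammogram actually has breast cancer, which is the kind of answer the doctors given natural frequencies tended to reach.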

Wright (2001) explored why women who have had a mammogram misunderstand relative risks. Observations showed that most women misunderstand relative risk because they think the number relates to women like themselves who take part in the screening. In truth, the relative risk relates to a different class of women: women who die of breast cancer without being screened. The confusion can be avoided by using natural frequencies or absolute risks.

Positive vs. Negative Framing

Positive framing is more effective than negative framing in persuading people to take risky treatment options (Edwards et al., 2001; Kuehberger, 1998). For example, a 70% success rate for a surgery sounds more promising than a 30% failure rate. Gain or loss framing is equally important in communicating clinical risk, and loss framing tends to be more persuasive when promoting action. The most obvious example is loss framing for screening (Edwards and Mulley, 2002): listing the number of deaths among women who did not get screened for breast cancer is scarier than presenting the natural frequency of deaths due to breast cancer in the entire population. Manipulation can also take the form of charts and population crowd figures (Shapire et al., 2001).

 

Causality and Animacy

Introduction to Causality

Functional relations are fast, automatic, subconscious and driven by the stimulus. They allow us to perceive properties of simple displays that do not objectively exist in the displays themselves (Scholl and Tremoulet, 2000). In other words, we perceive the objects, but the relationships between them are interpreted without a physical basis. Types of relationships that arise from functional relations include animacy and causality. Both relations are complex and require higher-level cognitive processing, yet the stimuli involved are basic. The fact that such basic stimuli induce high-level percepts suggests that our visual system determines not only physical structure but also higher-order properties of the stimulus (ibid).

Michotte and “Illusions of Causality”

A type of functional relation called the kinetic depth effect produces causality and animacy: as the name suggests, movement of stimuli at specific intervals causes the perception of a relationship. Michotte conducted some of the earliest and best demonstrations of perceptual causality in 1946, including the launching effect, the entraining effect, the launching effect with a temporal gap, the triggering effect, the launching effect with a spatial gap and the tool effect. His strategy was to show observers various ways in which simple stimuli could interact to influence the perception of causality. He called the phenomenon the "illusion of causality" and described it in terms of A causing B's motion, or B's motion being a continuation of A's motion. Animacy and causality, Michotte believed, have these qualities because simple motion cues are the foundation for social perception in general (gestures, facial expressions, speech, etc.). Babies rely on these simple motion cues for survival, which explains why the skill is present from birth. Despite critics' disapproval of Michotte testing these perceptions on himself and his colleagues, the percepts have proved to be culturally universal (Morris and Peng, 1994; Rime et al., 1985).

In 1963, Michotte outlined the factors that influence our perception of causality and are vital for the effect to occur: priority, timing, proportionality and exclusivity. If any of these factors is violated, the perception of causality can be broken. Timing, as discussed below, has the greatest effect on causality, but it is important to understand the other factors as well. First, priority: for A to move B, A must move first. Second, proportionality: the action of B must be proportional to the movement of A, including consistent velocity, trajectory and so on. Third, exclusivity: for A to cause B's movement, A must be the only possible factor that could have done so.

To further the work on causality, White and Milne (1997) created another simple display based on Michotte's work, in which one stimulus appeared to pull another. Again, White and Milne found the phenomenon of causality to be salient, immediate and irresistible. Further research suggests that the phenomenon is so ingrained that causality is perceived even by babies. Leslie and colleagues (Leslie et al., 1982, 1984, 1987) studied this principle in six-month-old infants. The infants were habituated to a short film based on Michotte's original displays; after habituation, they found a reversal of the film more interesting. Leslie (ibid.) argues that the six-month-olds looked longer at the reversed displays because the reversal involved an additional change in the causal roles.

Heider and Simmel

Heider and Simmel (1944) created a film showing three geometric figures, a large triangle, a small triangle and a small circle, moving around a rectangle with a small opening. Although static clips of the film convey very little information about the motion properties of the shapes, after viewing the animation observers were remarkably consistent in how they described them. Observers attributed personality traits and emotions to the geometric figures regardless of the instructions they were given. Temporal contiguity and spatial proximity created relationships among the figures, such as the large triangle chasing the other two. Just as with Michotte's experiments, several researchers replicated the study and confirmed that the effects are universal (Morris and Peng, 1994; Rime et al., 1985; Hashimoto, 1966). Berry and Springer (1993) found that even three- and four-year-olds attributed personality traits (e.g. desires and emotions) to the geometric shapes. Replacing the geometric shapes with other simplified objects had no effect; the effect of the Heider and Simmel movie was only significantly reduced when the movie was temporally tampered with.

Stewart et al. (1982) proposed that we perceive animacy because, when we see movement without a source, we subconsciously attribute the movement to a hidden energy source, in this case animacy. This energy-violations hypothesis delineates three conditions under which animacy is assumed:

1. An object starting from rest

2. Direct movement towards a goal

3. Sharp changes in direction to avoid obstacles 

The Importance of Contiguity and Contingency

Temporal tampering has such a significant effect because perceiving animacy depends on the interaction between targets and goals and on the impression of intentionality in the movement (Dittrich and Lea, 1994). Dittrich and Lea also found that participants reported more intentional percepts when the trajectory and speed of the target object were greater than those of the distractors. Wasserman and Neunaber (1986) further argue that people are sensitive not just to pairings of cause and effect but also to the contingency, or temporal correlation, between them (relative contiguity). Causal judgments are reduced when the action postpones the occurrence of the outcome; relative contiguity, and thus perceived causality, decreases as the delay between action and outcome increases.

Another approach assumes that contingency and contiguity effects are mediated by independent mechanisms (Shanks and Dickinson, 1987). On this view, causality judgments based on contiguity-sensitive mechanisms are the result of associative conditioning: the occurrence of an outcome increases the associative strength of the stimuli and actions that precede it. Their approach is consistent with the ΔP (delta-P) theory, which argues that judgments of contingency are based on the difference between the perceived probability of the outcome given the action and the perceived probability of the outcome in its absence. A study carried out by Williams (1976) offers support for the ΔP theory. Williams studied pigeons, with the dependent variable being their performance on simple variable-interval schedules with unsignalled delays of reinforcement. He found that imposing a programmed delay of 3-5 seconds produced reductions in responding.
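A minimal sketch of the ΔP rule: contingency is simply the difference between the probability of the outcome when the action is taken and when it is not. The trial counts below are hypothetical.

```python
def delta_p(p_outcome_given_action, p_outcome_given_no_action):
    """The ΔP rule: contingency = P(outcome | action) - P(outcome | no action)."""
    return p_outcome_given_action - p_outcome_given_no_action

# Hypothetical counts: the outcome followed the action on 30 of 40 trials,
# and occurred on 10 of 40 trials on which no action was taken.
print(delta_p(30 / 40, 10 / 40))   # 0.5 -> a strong positive contingency
```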

The Neuroscience of Causality and Animacy

In 1998, Heberlein and colleagues showed patients the Heider and Simmel movie. They found that patients with amygdala damage did not describe the shapes using any social or anthropomorphic terms. These findings make sense given the amygdala's role in emotional processing. Even with a damaged amygdala, however, the patients were still able to judge causality and animacy, suggesting that other parts of the brain underlie these percepts.

A PET study conducted by Happé and Frith (1999) found that displays of intentional movement produced more activity in the temporoparietal junction, fusiform gyrus, occipital gyrus and medial frontal cortex than random-movement displays. Processing in these areas appears to be domain specific, or organised into modules, a claim based on Fodor's (1983) account of modularity. Perception of causality and animacy is domain specific: processing is restricted to specific causal and intentional interpretations, it is an essentially visual phenomenon, and it is fast, automatic and innate.

Receptive Fields: Simple Cells and Tuning Curves

Receptive Fields

The jumping spider makes use of pattern recognition to distinguish prey from mates. Its eyes allow it to detect specific features or templates, specifically the bar-shaped feature resembling the legs of other spiders (Land, 1969). Frisby and Stone (2010) discuss the jumping spider because it provides an excellent paradigm for how template matching works, through a series of steps:

–       First, the original image is projected on to the retina of the spider’s eye

–       Second, the spider’s eye focuses on a specific feature, in this case the leg

–       Then the leg’s image is cast onto the retina, and receptors project this information to the primary visual cortex (V1)

–       Neurons in V1 receive inhibitory and excitatory input from the receptors in the retina

–       The receptors not blocked by the leg have increased activity and send an excitatory light signal

–       The receptors blocked by the leg have decreased activity and send an inhibitory light signal

–       The neuron gathers the total input and if it exceeds its threshold, firing indicates that a bar is present

Template matching, whether in a spider or a human, relies on encoding a pattern whose shape directly matches the input pattern to be detected. Striate cortex cells in V1 are found in all mammals. These cells receive input from retinal fibres, and each cell is responsible for a limited patch of the retina (its receptive field). Accordingly, cell types in the striate cortex are classified according to their receptive fields (Hubel and Wiesel, 1981). Striate cells fall into two types: simple cells and complex cells.
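The weighted-sum-and-threshold idea behind template matching can be sketched in a few lines; the tiny 3x3 "retina", the weights and the threshold are all invented for illustration.

```python
# A toy version of the weighted-sum-and-threshold step described above.
# The 3x3 "retina", the weights and the threshold are all invented for illustration.
image = [            # 1 = light, 0 = dark; a vertical bright bar down the middle
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
weights = [          # excitatory (+1) centre column, inhibitory (-1) flanks
    [-1, +1, -1],
    [-1, +1, -1],
    [-1, +1, -1],
]
THRESHOLD = 2

net_input = sum(image[r][c] * weights[r][c] for r in range(3) for c in range(3))
print("net input:", net_input, "-> fires" if net_input > THRESHOLD else "-> stays silent")
```

A vertical bright bar drives the net input above threshold, whereas a horizontal bar excites and inhibits the unit in equal measure and leaves it silent.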

Templates can, however, be impractical, because the number needed grows exponentially, producing a binding problem (Frisby and Stone, 2010). For example, if we needed 18 orientation templates, each sensitive to 18 sizes, each in turn sensitive to 18 shades (already 18 × 18 × 18 = 5,832 templates), you can imagine the ridiculous number of templates required. This problem is known as a combinatorial explosion.

Simple Cells

Simple cells are so named because their receptive fields can be simply mapped into excitatory and inhibitory sub-regions. Simple cells are optimally excited by bar shapes, which is why they are also called slit- and line-detectors: slit-detectors respond to a light bar on a dark surround, and line-detectors respond to the opposite (Hubel, 1988). Light on dark, or vice versa, matters because simple cells respond best to patterns that generate luminance differences, in other words edges. Because simple cells are so sensitive to edges, the orientation of a bar is important. The optimal stimulus for a simple cell, to emphasise luminance differences, is one that provides maximum excitation and minimum inhibition (Frisby and Stone, 2010). To achieve this, different orientations are dealt with by different cells (Hubel and Wiesel, 1962). The angle to which each cell is tuned is determined by the pattern of its excitatory and inhibitory regions. 'Slit' and 'edge' simple cells exist for a full range of orientations, which is reflected in the brain's wiring: the fibres going from retina to cortex differ depending on which orientation they represent (Frisby and Stone, 2010).

Population Code

As discussed in the previous section, bars maximally excite simple cells; however, cells still respond even when they are not maximally excited. If part of a cell's receptive field is activated, a partial response is produced. As such, context is vital to making sense of the visual input. For example, a non-optimal orientation can stimulate a vertically tuned receptive field just as strongly as a vertical but faint edge. To distinguish between the two, the cell's output must be considered in the context of the activity of other cells examining the same retinal patch.

Fortunately, having sensitive simple cells makes interpolation between neighbouring orientation measurements possible. This "talk", or interpolation, between cells is known as a population code. Even though there are simple cells for only around 18-20 preferred orientations, we manage discriminations finer than 0.26 degrees (Frisby and Stone, 2010). Communication between cells allows us to discriminate orientations that vary only slightly, in the grey area between the preferred orientations. A population of cells that share the same preferred value of a particular stimulus dimension, such as orientation, is called a channel. Scientists can measure the preferred orientation of cells by recording the symmetric pattern of firing rates.

Unfortunately, a major consequence of having a limited number of cells tuned to a large range of orientations is that the cells taking each measurement need to be broadly tuned, giving "coarse coding". How many channels are necessary to resolve the ambiguity problem? The answer is technically two, but then the cells would be so broadly tuned that their outputs would change far too slowly with orientation to be informative: unless the input coincided with the flank of the tuning curve, there would be very little difference between the cells' outputs. Hence, the brain uses a large number of broadly tuned cells, arranged so that the most sensitive part of some tuning curve always covers each orientation, with the less sensitive parts representing slight deviations from the optimal orientation (Frisby and Stone, 2010).
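Here is a sketch of how a population of broadly tuned channels could signal orientation more finely than their spacing; the Gaussian tuning, the 18 preferred orientations and the population-vector read-out are illustrative modelling choices, not the brain's actual algorithm.

```python
import math

# Hypothetical channel model: 18 broadly tuned cells with preferred orientations
# spaced every 10 degrees (orientation is circular with a period of 180 degrees).
PREFERRED = [i * 10 for i in range(18)]
TUNING_SD = 30.0   # broad tuning width, in degrees

def rate(stimulus_deg, preferred_deg):
    """Gaussian tuning on the circular difference between two orientations."""
    d = (stimulus_deg - preferred_deg + 90) % 180 - 90
    return math.exp(-0.5 * (d / TUNING_SD) ** 2)

def decode(stimulus_deg):
    """Population-vector read-out: average preferred orientations, weighted by firing rate."""
    # Orientation repeats every 180 degrees, so double the angles before averaging.
    x = sum(rate(stimulus_deg, p) * math.cos(math.radians(2 * p)) for p in PREFERRED)
    y = sum(rate(stimulus_deg, p) * math.sin(math.radians(2 * p)) for p in PREFERRED)
    return (math.degrees(math.atan2(y, x)) / 2) % 180

print(decode(37.3))   # ~37.3, far finer than the 10-degree spacing of the channels
```

Because many broadly tuned cells respond to any one edge, the weighted average recovers orientations that fall between the preferred values.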

Tuning Curve

The overall relationship between the orientation of an input edge and the output of the cell is called the tuning curve (Frisby and Stone, 2010). Tuning curves are important because they allow you to pinpoint which cells are sensitive to which orientations. The flank mentioned above is where the slope of the tuning curve is greatest; it represents the point of greatest change in firing rate, and this peak in sensitivity is found about half way down from the top of the curve. The trough and the top of the curve are the least sensitive parts, because there the slope is equal to zero. Regan and Beverley (1985) showed that humans do indeed have peaks and troughs in their orientation sensitivity.
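The claim about the flank can be checked numerically with the same illustrative Gaussian tuning curve used in the sketch above: the change in output produced by a one-degree change in orientation is largest part-way down the curve and close to zero at the peak.

```python
import math

SD = 30.0   # the same illustrative tuning width as in the sketch above

def rate(offset_deg):
    """Gaussian tuning curve as a function of distance from the preferred orientation."""
    return math.exp(-0.5 * (offset_deg / SD) ** 2)

def sensitivity(offset_deg, step=1.0):
    """Change in firing rate produced by a one-degree change in the stimulus."""
    return abs(rate(offset_deg + step) - rate(offset_deg))

for offset in (0, 15, 30, 45, 60):
    print(f"{offset:>2} deg from peak: change per degree = {sensitivity(offset):.4f}")
# The change is near zero at the peak, largest on the flank (about one SD away),
# and falls off again in the tail of the curve.
```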

Seeing Maps


All maps, whether in the brain or not, represent a mathematical function, v = f(u), transforming points in one space (the domain, u) into points in another (the codomain, v). For a map to be accurate it must be continuous, without any breaks: every pair of nearby points in the domain must correspond to nearby points in the codomain. For example, take the cities Sheffield and Leeds, which represent two geographical points. For a map to be accurate, Sheffield and Leeds on the map must correspond proportionally, in distance, direction and so on, with Sheffield and Leeds in real life. The map would be the codomain and the real cities the domain. In the visual system, these points could be two retinal points (domain) accurately reflected onto the striate cortex (codomain). In addition, a map of the brain must specify a direction; direction here is not used in the everyday sense but refers to the continuity between domain and codomain. For example, mapping from the whole retina to the striate cortex is actually discontinuous; however, mapping from each half of the retina to the corresponding striate cortex is continuous, giving rise to the retinotopic map. Mapping against the specified direction is called inverse mapping; an example is mapping from the cortex back to the retina, which is discontinuous. Most nearby points in the striate cortex do correspond to nearby points in the retina, but if two striate points lie on different ocularity stripes, the inverse mapping is discontinuous (Blasdel, 1992).

Scientists argue that the striate cortex maintains the retinotopic map because doing so economises on the length of nerve fibres. Nerve fibres are necessary for inter-hypercolumn communication, and if neighbouring points were spread out, the wiring would become chaotic. Of course, mapping does not occur only in the visual system: the brain has a myriad of maps, including maps for audition, touch and motor output, and many copies of these maps exist. All of them, like the retinotopic map, must be continuous, suggesting that spatial organisation is key to a healthy, functioning brain. Unfortunately, because there are so many features of the visual input that need to be represented in the map, singularities arise.

Singularities are jumps in continuity (Frisby and Stone, 2010), and they are the result of the packing problem. Put simply, the brain wants to pack all features of the visual input into the cortex, and to maximise efficiency it wants all similar variations of a feature in one place; in other words, the brain wants continuity. However, as with the binding problem, there are far too many variations and features of the visual input to represent them all in every possible detail, so some continuity must be sacrificed (Hubel, 1981). The brain's map of the visual field is continuous with respect to retinal position but discontinuous with respect to orientation. Even so, the brain still attempts to keep similar preferred orientations close together, maximising efficiency to the limited extent it can (Frisby and Stone, 2010).

Another way of thinking about singularities and the packing problem is in terms of parameters. As discussed above, each retinal position is defined by two position parameters (x, y), and each retinal point must correspond to a point on the cortex (x', y'). Even though the cortex exists in three dimensions, its surface effectively offers only two dimensions for laying out information, so maps are limited to representing information in 2D. As orientation brings its own parameter (theta), the brain has to represent x, y and theta in 2D, which cannot be done without introducing discontinuities in at least one of the parameters. Because the cortex must maintain a smooth map of retinal position, the discontinuities are introduced into the representation of orientation. The packing problem can therefore be redefined as the attempt to pack a 3D parameter space into a 2D one. As a solution, the cortex treats the parameters with different priorities; low priority is given to orientation, hence the singularities.


Representation of point singularities in the visual cortex: each colour represents a different radial phase corresponding to an orientation column.

As singularities exist for orientation, the topological index was introduced to describe them. Specifically, the index tells us how orientation varies as we move around the centre of a singularity (Frisby and Stone, 2010). This can be measured by drawing a circle around a singularity and moving clockwise around it: if the underlying orientation also changes clockwise, the singularity is positive; if the orientation changes anticlockwise, the singularity is negative. Orientation around singularities in the striate cortex rotates by no more than 180 degrees, so the index is always +1/2 or -1/2. Hypothetically, a pinwheel would be a full rotation, with an index of +1; to date, no such pinwheels have been found in our visual cortex. Tal and Schwartz (1997) found that for any neighbouring singularities you can usually draw a smooth curve between them, and the cells along that curve form columns with the same orientation preference (iso-orientation). In addition, Tal and Schwartz confirmed that nearby singularities have topological indices of the same magnitude but opposite sign.
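The topological index can be computed by walking around a singularity and summing the wrapped changes in orientation; the orientation map below is a toy example with a single +1/2 singularity at the origin, not data from a real cortex.

```python
import math

def orientation(x, y):
    """Toy orientation map with a single singularity at the origin."""
    return (math.degrees(math.atan2(y, x)) / 2) % 180

def topological_index(cx, cy, radius=1.0, steps=360):
    """Walk a small circle around (cx, cy), summing wrapped orientation changes."""
    total = 0.0
    prev = orientation(cx + radius, cy)
    for k in range(1, steps + 1):
        a = 2 * math.pi * k / steps
        cur = orientation(cx + radius * math.cos(a), cy + radius * math.sin(a))
        total += (cur - prev + 90) % 180 - 90   # wrap each difference into [-90, 90)
        prev = cur
    return total / 360.0

print(topological_index(0.0, 0.0))   # ~ +0.5 for this map
```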

In addition to orientation, ocularity is another parameter that has to be represented in the striate cortex. Ocularity refers to the extent to which cells respond to each eye. The brain has ocularity stripes, meaning that columns in the striate cortex alternate between the two eyes; the stripes suggest that the brain wants to ensure that every part of the visual field is processed by a pair of L and R stripes. Unfortunately, adding another feature parameter only worsens the packing problem. Researchers have found that the brain maximises economy by ensuring that each iso-orientation domain in the orientation map tends to cover a pair of L-R ocularity columns; in other words, each orientation is represented for each eye (Hubel and Wiesel, 1971). Furthermore, the brain needs to perceive lines of different widths, which it solves by having cells tuned to the same orientation but sensitive to different spatial frequencies. As with orientation, the representation of spatial frequency is continuous except for some singularities. Lastly, directionality is packed together with orientation. Except for sudden changes in direction (180 degrees), the direction map is continuous. It overlays the orientation map without affecting its continuity, because each orientation defines two possible directions (Frisby and Stone, 2010).

Fortunately, colour does not add to the packing problem. This is because colour is represented exclusively at the centres of orientation singularities, where cells have no preferred orientation, so colour does not add a parameter. To show how orientation, ocularity and colour come together, the polymap was constructed as an overlay of all the parameters. Based on observations of polymaps, in addition to the specific wiring of the brain, some scientists argue that the cortex is not really trying to solve the packing problem: what the maps try to do is minimise the amount of wiring the brain needs to employ (Frisby and Stone, 2010). Indeed, Swindale et al. (2000) found that the cortex does attempt to maximise coverage, and that small changes to the current mapping would reduce its efficiency.

 

Seeing Objects

Binocular disparity is the subtle difference between the left and right retinal images: to the left of centre the left eye sees more of the scene than the right eye, and vice versa. Neighbouring layers responding to the left and right eyes can inhibit one another when necessary. The optic nerves from the two eyes join at the optic chiasm, where some of the fibres decussate. Optic nerves contain axons that emanate from the retinal ganglion cells in the eye. Whether or not the fibres decussate, all of them pass through the lateral geniculate nucleus, from which fibres feed into the striate cortex. Importantly, the striate cortex preserves the neighbourhood relations between the retinal ganglion cells; in other words, the striate cortex has a retinotopic map. The map is stretched and magnified around the fovea, which is consistent with the quality of foveal vision and reflects the number of cells dedicated to that part of the retina.


Types of Retinal Ganglion Cells

First, the midget retinal ganglion cells are the most common, making up about 80% of retinal ganglion cells. These cells respond to static form and project to the parvocellular layers of the lateral geniculate nucleus, specifically bilayers 3 through 6. Second, the parasol retinal ganglion cells make up about 8-10% of retinal ganglion cells. These cells have on/off receptive fields and receive their light input from rods; as expected, they respond to increases and decreases in light level, and they also respond to motion. Output from the parasol cells projects to the magnocellular layers of the lateral geniculate nucleus, specifically bilayers 1 and 2. Third, the bistratified retinal ganglion cells make up less than 10% of retinal ganglion cells. They respond to short (blue) wavelengths by increasing their firing rate and to middle (yellow) wavelengths by decreasing it; their output projects to the konio sublayers 3 and 4. Lastly, the biplexiform retinal ganglion cells are equally rare, making up less than 10%. The exact function of these cells is unknown, but they connect directly to rods and have on-centre receptive fields; it is believed that they provide information about ambient light.


Features of the Lateral Geniculate Nucleus

The lateral geniculate nucleus consists of six major layers, each comprising a main layer plus a konio-cell sub-layer. Each layer carries information from one eye, and all the layers of one lateral geniculate nucleus receive input from half of visual space. Even though layers 2, 3 and 5 correspond to the left eye, they receive information from only half of that eye's retina; the other half corresponds to the right visual field. As in the striate cortex, cells in each layer are organised retinotopically, and each layer encodes a different aspect of the retinal image. Each lateral geniculate nucleus therefore contains twelve copies of half the visual field (two per bilayer). It is important to note, however, that only 10% of its input comes from the retina; around 30% comes from feedback connections, including those from the striate cortex and the midbrain. Thorpe (1996) proposed that the brain uses the feed-forward connections from retina to lateral geniculate nucleus to striate cortex to perform a "quick and dirty" analysis, with feedback connections then refining the interpretation of the retinal image. Thorpe's hypothesis is supported by his demonstration that people can make rapid visual interpretations of briefly flashed images.

The Striate Cortex (V1)

The striate cortex is responsible for early feature-detection representations, including colour; stimulation of the striate cortex produces hallucinations of swirling colour (Frisby and Stone, 2010). In addition, all layers of the striate cortex contain orientation-tuned columns except layer 4B; the LGN and the retinal ganglion cells, by contrast, are not orientation-tuned. Like the lateral geniculate nucleus, the striate cortex is organised in layers, with a horizontal, vertical and retinotopic organisation. The top layer of V1 contains pyramidal cells and their dendrites, while the bottom layer contains pyramidal-cell axons as they exit the cortical layers. Neurons in these layers are arranged into vertical columns, with each column dedicated to one retinal patch and a specific stimulus characteristic. The precision of the retinal map decreases the further you move from its centre.

In 1978, Hubel et al. carried out a study to demonstrate the existence of orientation-tuned columns in the striate cortex. Anaesthetised macaque monkeys had their eyes exposed to a pattern of vertical stripes continuously for 45 minutes. The stripes were of irregular width, filled the entire visual field and moved about so as to activate the whole striate cortex. A chemical marker was then injected to be taken up by active cells, and an autopsy performed immediately afterwards showed increased uptake in the columns tuned to vertical.

These orientation-tuned columns are now called hypercolumns. Each hypercolumn contains a mass of different types of cells that together process the same retinal patch; the patch of the retinal image that each hypercolumn deals with is called its hyperfield. Hyperfields must overlap to some degree, which allows edge features to be detected (Frisby and Stone, 2010). As well as overlapping, inter-hypercolumn communication links edge features together to create one unified image; this communication is possible thanks to horizontal fibres that run across the vertical columns. The cortical area dedicated to a given patch remains quite constant; however, processing becomes coarser the further you move from the centre, which is why feature detection becomes cruder in the periphery. In addition to orientation, the cells of a hypercolumn can also be tuned to colour, scale and ocularity (Frisby and Stone, 2010). The ice-cube model (Hubel and Wiesel, 1962) treats each hypercolumn as an image-processing mechanism in its own right.

Convolution images are used to give an idea of the activity profile of the striate cortex, representing the output of simple cells (Frisby and Stone, 2010). Each point on a convolution image represents the response of a single simple cell centred over the corresponding point of the retinal image. White represents a large positive output, grey represents no output, and black represents a large negative output; outputs are thus coded as pixel grey levels. Inside a hypercolumn is a pattern of activity corresponding to one area (point) of the convolution image, an area representing the hyperfield (ibid).
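Here is a small example of how a convolution image could be produced: every output value is the response of one simple-cell-like filter centred on the corresponding retinal position. The 5x7 image and the 3x3 vertical-slit kernel are invented for illustration.

```python
# Every output value below is the response of one simple-cell-like filter centred
# on the corresponding retinal position; the 5x7 image and the 3x3 vertical-slit
# kernel are invented for illustration.
image = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
]
kernel = [           # excitatory centre column, inhibitory flanks
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

def convolve(img, ker):
    h, w, k = len(img), len(img[0]), len(ker)
    out = [[0] * (w - k + 1) for _ in range(h - k + 1)]
    for r in range(h - k + 1):
        for c in range(w - k + 1):
            out[r][c] = sum(img[r + i][c + j] * ker[i][j]
                            for i in range(k) for j in range(k))
    return out

for row in convolve(image, kernel):
    print(row)   # positive -> white, zero -> grey, negative -> black
```

In a grey-level rendering, the column over the bar comes out white, its immediate flanks black, and the uniform background mid-grey.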

Complex Cell

Unlike simple cells, complex cells cannot be mapped into positive and negative regions, and the optimal stimulus does not need to fall on any particular part of the receptive field. However, a line or slit of a particular orientation is still the preferred stimulus. One theory of complex-cell receptive fields proposes that they can be predicted by supposing that complex cells receive their input from a series of suitably placed simple cells (Frisby and Stone, 2010). However, this theory cannot be entirely correct, because some complex cells do not even receive input from the striate cortex.

Hypercomplex, or "end-stopped", cells cannot be mapped into positive and negative regions either, but unlike simple and complex cells they prefer moving stimuli. In addition, hypercomplex cells are selective for stimulus length; the best stimulus is either a bar of a defined length or a corner.