Tag Archives: statistics

Heuristics: Representativeness

What are Heuristics?

People rely on heuristics because they facilitate the task of assessing probabilities and predicting values; they allow us to make decisions quickly and instinctively. Although heuristics, like schemas, are often inaccurate, people look for evidence that the heuristic or schema is true and ignore failures of judgment (Tversky and Kahneman, 1974). Heuristic errors are known as systematic errors, and they occur because the heuristic cannot cope with the complexity of the task. Heuristics simply lack validity.


Representativeness is when the probability of B is evaluated by how much it resembles A, taking for granted the degree to which A and B are actually related (Tversky and Kahneman, 1974). Usually, representativeness heuristics are quite accurate, because if A resembles B there is a fair likelihood that they are somehow related. Unfortunately, similarities can be misleading, as they are influenced by factors that need to be taken into consideration when judging probability. Factors that influence similarity include: prior probability outcomes, sample size and chance.

– Insensitivity to Prior Probability Outcomes

A major influence on probability is base-rate frequency. For example, even though Steve's personality fits a librarian better than a farmer, the fact that there are many more farmers than librarians in his population needs to be taken into account when assessing the likelihood of one occupation over the other. If there are a lot of farmers in Steve's area because of its rich soil, the base-rate frequency suggests that Steve is more likely to be a farmer than a librarian.
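Base-rate neglect can be made concrete with Bayes' rule. The numbers below are hypothetical (the 90/10 base rate and the description likelihoods are assumptions for illustration, not figures from the original study): even if Steve's description fits a librarian eight times better than a farmer, the base rate keeps "farmer" the more probable occupation.

```python
# Hypothetical base rates: 90% of Steve's population are farmers, 10% librarians.
p_farmer, p_librarian = 0.90, 0.10

# Hypothetical likelihoods: Steve's shy, tidy description is seen in
# 40% of librarians but only 5% of farmers (an 8-to-1 fit for librarian).
p_desc_given_librarian = 0.40
p_desc_given_farmer = 0.05

# Bayes' rule: P(librarian | description)
evidence = (p_desc_given_librarian * p_librarian
            + p_desc_given_farmer * p_farmer)
p_librarian_given_desc = p_desc_given_librarian * p_librarian / evidence

print(round(p_librarian_given_desc, 3))  # still below 0.5
```

Despite the description strongly resembling a librarian, the posterior probability stays under 50% because farmers so heavily outnumber librarians.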

In 1973, Kahneman and Tversky conducted an experiment to show how often people overlook base-rate frequency when assessing probability. Participants were shown short personality descriptions of individuals sampled from a group of 100 lawyers and engineers, and the task was to assess which people were likely to be lawyers and which engineers. In condition A, participants were given a base rate of 70 engineers and 30 lawyers; in condition B, the base rate was 30 engineers and 70 lawyers. The two conditions produced virtually the same probability judgements despite the significantly different base rates clearly given to the participants. Participants only used the base rates when no personality descriptions were provided.

Goodie and Fantino (1996) also studied base-rate frequency. Participants were asked to determine the probability that a taxi seen by a witness was blue or green. Even though participants were given the base rate of taxi colours in the city, they still judged the probability mainly by the reliability of the witness.

– Insensitivity to Sample Size

Another major effect on probability is sample size. The similarity of a sample statistic to a population parameter does not depend on sample size; therefore, if probabilities are assessed by representativeness, the judged probability is independent of sample size. Tversky and Kahneman (1974) conducted an experiment to show evidence of insensitivity to sample size. Participants were given the following information:

–       There are two hospitals in a town, one small and one large

–       About 45 babies are born per day at the large hospital

–       About 15 babies are born per day at the small one

–       50% of the babies born are boys, but this figure differs slightly every day

Participants were then asked which hospital is more likely to report a day on which more than 60% of the babies born are boys. Most participants answered that the two hospitals are equally likely. However, sampling theory entails that the small hospital is more likely to report such a day, because small samples stray from the mean more easily than large ones.
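Sampling theory's answer is easy to check by simulation. The sketch below (the 10,000-day horizon is an arbitrary choice) counts how often each hospital sees a day on which more than 60% of births are boys.

```python
import random

random.seed(42)

def days_over_60_percent_boys(births_per_day, n_days=10_000):
    """Count simulated days on which more than 60% of births were boys."""
    count = 0
    for _ in range(n_days):
        # Each birth is an independent 50/50 boy-or-girl event.
        boys = sum(random.random() < 0.5 for _ in range(births_per_day))
        if boys / births_per_day > 0.60:
            count += 1
    return count

large = days_over_60_percent_boys(45)  # large hospital, 45 births/day
small = days_over_60_percent_boys(15)  # small hospital, 15 births/day
print(small, large)
```

The small hospital records roughly twice as many such days, because its daily proportion of boys varies far more around 50%.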


Unfortunately, the general public are not the only ones to fall victim to sample-size errors. In 1971, Tversky and Kahneman surveyed experienced research psychologists, and the majority stood by the validity of small sample sizes. The researchers simply put too much faith in results from small samples, underestimating how variable such samples are. It is likely that the reason the benefits of a large sample size are drilled into psychology students from day one is to avoid errors like this.

– Misconceptions of Chance

People expect that a randomly generated sequence of events will represent the essential characteristics of the generating process even when the sequence is short. In other words, people think that the sequence H-T-H-T-T-H is more likely than H-H-H-H-H-H even though the two are equally likely. This is because every T or H has to be assessed as an individual probability event: in trial one, you have a 50% chance of getting a T or an H; in the second trial, the result of the first has no impact, so you again have a 50% chance of either letter. Probability matching is another name for this misconception of chance. Andrade and May (2004) describe another scenario based on real-life misconceptions of chance.

First, participants are given a jar of 100 balls and told that 80% are red and 20% are white. When asked to predict which colour will come next, the most commonly observed strategy is one that imitates the 80/20 pattern. In reality, the most efficient strategy is to say red for every draw, because the probability, as stated above, needs to be assessed for each individual draw, not for the task as a whole. The implications of probability for gambling are huge, so it is not surprising that the gambler’s fallacy is another name for probability matching. People simply believe in the “law of averages”: that if an event has occurred less often than probability suggests, it is more likely to occur in the future.
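The gap between probability matching and always betting on the majority colour can be simulated. A minimal sketch, assuming the majority colour is red with an 80/20 split:

```python
import random

random.seed(0)
P_RED = 0.80  # assumed split: 80% red, 20% white

# Simulate 10,000 independent draws from the jar.
draws = ['red' if random.random() < P_RED else 'white' for _ in range(10_000)]

# Probability matching: guess red 80% of the time, white 20% of the time.
matching_guesses = ['red' if random.random() < P_RED else 'white' for _ in draws]
matching = sum(g == b for g, b in zip(matching_guesses, draws))

# Maximising: guess the majority colour (red) on every single draw.
maximising = sum(b == 'red' for b in draws)

print(matching / len(draws), maximising / len(draws))
```

Matching the proportions is right only about 0.8² + 0.2² = 68% of the time, while always saying red is right about 80% of the time.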

– Insensitivity to Predictability

Another issue with probability is insensitivity to predictability, which is when people ignore the reliability of a description and instead pay attention to unrelated factors. For example, a person might give a particular review more attention and greater credence simply because the reviewer shares their name. Another example would be ignoring negative reviews and only paying attention to positive ones because they confirm your own beliefs. Obviously, doing so means disregarding the reliability of the evidence.


Tversky and Kahneman conducted an experiment in 1973 in which participants were given several descriptions of the performance of a student teacher during a particular lesson. Some participants were asked to evaluate the quality of the lesson described as a percentile score, and others were asked to predict the standing of the student teacher five years after the practice lesson. The judgments of the second group matched the first group’s evaluations. Even though the participants were aware of how unpredictable a person’s performance five years into the future is, they confidently judged the student teacher’s future standing to be identical to the present performance. Sadly, high confidence in the face of poor probability judgment is common and is known as the illusion of validity. The confidence people display in their predictions usually depends on representativeness, with other factors ignored; the illusion persists even when a person is aware of the limited accuracy of prediction (ibid).

–  Misconceptions of Regression

People simply do not expect regression to occur, even in contexts where it is common (Tversky and Kahneman, 1974). A good example of regression towards the mean is height: two above-average-height parents are more likely to have a child of average height than of above-average height. Despite this, people tend to dismiss regression because it is inconsistent with their beliefs. This failure leads to overestimating the effectiveness of punishment and underestimating the effectiveness of reward: performance after a punished failure will usually improve, and performance after a rewarded success will usually decline, purely through regression towards the mean.

– Implicature

Implicature refers to what a sentence suggests rather than what is literally said (Blackburn, 1996). For example, the sentence “I showered and went to bed” implies that first I showered and then I went to bed; however, taken literally, it could mean I went to bed and then showered in the morning. Both readings are possible, but given the context it would be strange for me not to mean that I showered before going to sleep. Sometimes a qualification is added, which adds new information to the context. Even if the sentence itself is not altered, a qualification clarifies the implication. An example of a qualification would be: “I showered and went to bed, in that exact order.”

– The Conjunction Fallacy

The conjunction fallacy, first outlined by Tversky and Kahneman in 1983, refers to the tendency to believe that two events are more likely to occur together than either is to occur alone. Tversky and Kahneman’s “Linda the bank teller” question is a good example:

“Linda is 31 years old, single, outspoken, and very bright. She studied philosophy at University. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-war demonstrations.

Which is more likely?

A) Linda is a Bank Teller

B) Linda is a Bank Teller and is active in the feminist movement.”

The results showed that people chose B more often than A because they see the description as indicative of Linda’s personality, and B seems like a better fit for her as a person. In truth, B can never be more likely than A: feminist bank tellers are a subset of bank tellers, so the conjunction of the two events cannot be more probable than either event alone. Regardless, the general public as well as statistical experts still rely on the representativeness heuristic, ignoring the probabilities at play.
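The subset relation behind the fallacy is easy to check by simulation. The probabilities below are purely hypothetical; the point is only that the number of people who are both bank tellers and feminists can never exceed the number of bank tellers.

```python
import random

random.seed(1)

# Hypothetical probabilities, purely for illustration.
people = [
    {"bank_teller": random.random() < 0.05,
     "feminist": random.random() < 0.30}
    for _ in range(100_000)
]

n_teller = sum(p["bank_teller"] for p in people)
n_teller_and_feminist = sum(p["bank_teller"] and p["feminist"] for p in people)

# The conjunction can never be more frequent than either event alone.
print(n_teller, n_teller_and_feminist)
```

Whatever probabilities you plug in, the conjunction count is at most the single-event count, which is why option B cannot be the more likely answer.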


People tend to be overly confident in their estimates even when the irrationality of their thinking is pointed out to them (Fischhoff et al., 1977). Baron (1994) found evidence that one reason for our inappropriately high confidence is our tendency not to search for reasons why we might be wrong. As ridiculous as this may seem, studies consistently confirm that people ignore new information in order to hold on to their original belief. Weinstein (1989) reported a study in which racetrack handicappers were gradually given more and more information about the outcomes of a race. Despite becoming more informed, the participants held onto their original beliefs with ever more confidence. DeBondt and Thaler (1986), by contrast, propose that when new information arrives, investors revise their beliefs by overweighting the new information and underweighting the earlier information.


Normal or Gaussian Distribution

Characteristics of Normal Distribution 

– Symmetrical about the mean

– Tails should meet the axis at infinity

– Bell-shaped distribution

– Mean = mode = median

– The area under the curve within 1 standard deviation of the mean makes up 68% of the entire distribution (this means that if you randomly select a point under the curve, there is a 68% chance it will fall within one standard deviation of the mean)

– The area under the curve within 1.96 standard deviations (roughly 2) of the mean makes up 95% of the entire distribution (this means that if you randomly select a point under the curve, there is a 95% chance it will fall within about two standard deviations of the mean)

– The mean of the sample mean distribution = the mean of the population

– The standard deviation of the sample mean distribution, or standard error, = (SD of the population)/(square root of the number of scores)

– The standard error indicates the degree to which sample means deviate from the population mean

– The sample mean distribution converges to normal distribution as the size of the sample increases

– The bell-shaped curve can also be reflected in the lay-out of a histogram
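The 68% and 95% figures above can be verified empirically. A sketch assuming a distribution with a mean of 100 and an SD of 15 units:

```python
import random

random.seed(0)
MEAN, SD = 100, 15  # assumed parameters for illustration

# Draw a large sample from a normal distribution.
scores = [random.gauss(MEAN, SD) for _ in range(100_000)]

# Proportion of scores within 1 SD and within 1.96 SDs of the mean.
within_1sd = sum(abs(x - MEAN) <= SD for x in scores) / len(scores)
within_2sd = sum(abs(x - MEAN) <= 1.96 * SD for x in scores) / len(scores)

print(round(within_1sd, 2), round(within_2sd, 2))  # roughly 0.68 and 0.95
```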


(Figure not shown: a normal distribution with an SD of 15 units.)


Questions Dealing with Standard Deviation

Question: Assume the standard deviation is 10 and the mean score is 100. If you randomly select any point 1 standard deviation from the mean, what would be your range?

Answer: The range would be between 90 and 110, as one standard deviation is 10 units to the left or right. You could also say that you have a 68% chance of randomly picking a score between 90 and 110 on this graph.

Question: Assume the standard deviation is 10 and the mean score is 100. If you randomly select any point 2 standard deviations from the mean, what would your range be?

Answer: The range would be between 80 and 120. As one standard deviation is 10 units to the left or right, 2 standard deviations are 20 units to the left or right. You could also say that you have a 95% chance of randomly picking a score between 80 and 120 on this graph.

N.B.: 95% is the commonly accepted confidence level in psychological studies; the corresponding alpha level for rejecting the null hypothesis is p < 0.05.

The z-Score 

It is possible to convert all normal distributions to the standard normal distribution.

For a standard normal distribution the mean has to equal 0 and the SD has to equal 1.

You can find the z-score by subtracting the mean from each data point, and then dividing this zero-meaned data by the standard deviation.

If your final data point is +1, this point is one standard deviation above the mean. If your final data point is -3, this point is 3 standard deviations below the mean. The z-score is particularly useful for comparing data across different situations.
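The conversion described above can be sketched in a few lines (the data points are hypothetical):

```python
import statistics

data = [85, 90, 100, 110, 115]  # hypothetical scores

mean = statistics.mean(data)
sd = statistics.pstdev(data)  # population SD; use stdev() for a sample

# z-score: subtract the mean, then divide by the standard deviation.
z_scores = [(x - mean) / sd for x in data]
print([round(z, 2) for z in z_scores])
```

After the conversion the z-scores always have a mean of 0 and an SD of 1, which is what makes them comparable across different situations.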

Error Bar Charts

Error bar charts are a way of representing the confidence interval. Error bars display each mean as a point on a chart, with a vertical line through the point representing the confidence interval; the longer the line, the wider the confidence interval. Error bar charts can also be used to see whether two population means differ from each other by comparing confidence intervals: if the confidence intervals do not overlap, we can be 95% confident that the two population means differ.
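A 95% confidence interval of the kind an error bar represents can be computed as mean ± 1.96 × standard error. A sketch with hypothetical scores (using the normal critical value 1.96; for small samples a t critical value would give a slightly wider interval):

```python
import math
import statistics

sample = [102, 98, 110, 105, 95, 101, 99, 107, 103, 100]  # hypothetical scores
n = len(sample)

mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean

# Approximate 95% CI: mean plus or minus 1.96 standard errors.
lower, upper = mean - 1.96 * se, mean + 1.96 * se
print(round(lower, 2), round(upper, 2))
```

The vertical line of the error bar would run from `lower` to `upper`, centred on the mean point.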



ZHENG, Y. (2013). Referencing and citation – Harvard style, from PSY104 Methods and Reasoning for Psychologists. University of Sheffield, Richard Roberts Building on 11th February. Available from: Blackboard.
[Accessed 4/02/13].

The Statistical t-Test


The statistical t-test is used to compare the means of two conditions. A t-test can be applied to both between-participants and within-participants designs. The test can only be used on normally distributed data, and as such is a parametric test. The purpose of the t-test is to decide whether or not the difference between the means of the two conditions is statistically significant. If the difference is statistically significant, we are able to accept our experimental hypothesis and also give some directionality to it. If the difference is not statistically significant, we must reject our experimental hypothesis and accept the null hypothesis.

The t-score is technically more than just the difference between the means: it is the difference between the means divided by the standard error of that difference. For the difference to be statistically significant at the conventional 95% confidence level, the p-value associated with the t-score needs to be less than the alpha level of 0.05. The alpha level was set at 0.05 to try to balance the rates of type I and type II errors. A type I error is when we reject the null hypothesis but should not have; a type II error is when we retain the null hypothesis but should have rejected it. If the sample size is large and the null hypothesis is true, the distribution of t-scores is approximately normal; the smaller the sample, the more tail-heavy the distribution becomes.

The way this is interpreted is: if the two groups come from the same population, then 95% of the time the t-score (reflecting the difference in the means) will fall within the central 95% of the t-distribution.

Degrees of Freedom

Degrees of freedom for a within-participants design is the number of participants minus 1.

Degrees of freedom for a between-participants design is (number of participants in group 1 − 1) + (number of participants in group 2 − 1).

SPSS will do the math for you!


When you start with your mean scores, assume that the null hypothesis is true and that there is no significant difference between the means.

Then set your significance level, or alpha level, at p < 0.05. SPSS should do this automatically.

Then using SPSS calculate the t-score.

If the t-score is within the 95% interval: accept the null hypothesis and reject the experimental hypothesis.

If the t-score is outside the 95% interval: reject the null hypothesis and accept the experimental hypothesis. You have now established that there is a significant difference between the two means.
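These notes rely on SPSS, but the paired t-score itself is simple to compute by hand: the mean of the per-participant differences divided by its standard error. A sketch with hypothetical within-participants data:

```python
import math
import statistics

# Hypothetical within-participants data: each participant tested twice.
cond1 = [12, 15, 11, 14, 13, 16, 12, 15]
cond2 = [14, 17, 12, 17, 15, 18, 13, 17]

# Per-participant difference scores.
diffs = [b - a for a, b in zip(cond1, cond2)]
n = len(diffs)

# Paired t-score: mean difference over its standard error.
t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
df = n - 1  # degrees of freedom for a within-participants design

print(round(t, 2), df)
```

SPSS produces the same t and df in its Paired Samples Test table; you would then compare the associated p-value against the 0.05 alpha level as described above.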


This is an example of the type of output given by SPSS (the output table itself is not shown in this post). From this output you can answer the following questions:

Question 1: Is the experimental design within or between participants?

Answer: The experimental design is within participants. You can tell this from the heading where it says “Paired Differences”.

Question 2: What is the t-score?

Answer: The t-score is -9.60.

Question 3: What are the degrees of freedom?

Answer: The degrees of freedom (df) is 77.

Question 4: Is it a two-tailed or one-tailed test?

Answer: It is two-tailed, as shown in the last box: Sig. (2-tailed).

Question 5: Is the result significant at an alpha level of 0.05? Why?

Answer: The result is significant at the alpha level because p<0.001, which obviously is less than 0.05.

 Reporting the Results 

This is an example of how you would report the following data for the results section of a lab report:

The means and standard deviations of participants’ estimated and ideal IQ scores are given in the Table (not in this post). The data were analysed using a two-tailed within-participants t-test and an alpha level of 0.05. There is a statistically significant difference between the ideal IQ and the estimated IQ, with the estimated IQ significantly lower than the IQ for an ideal job, t(77) = -9.60, p < 0.001.

Dispersion and Central Tendency

I think people underestimate the amount of statistics that is necessary for psychological research. Luckily, as long as you understand the theory behind the statistics, most of the maths is done by a computer. For my undergraduate course we use a programme called SPSS, which you can buy off Amazon, though most universities supply it at a reduced price for their students. This post is an introduction to the basics of statistical theory used in psychological research.



Levels of Measurement

Nominal measurement is the lowest level of measurement, and includes categorical data and measures of frequencies.

Ordinal measurement involves rating scales to measure participant responses.

Interval measurement involves equal intervals, for example measuring temperature; interval measurement has no absolute zero point.

Ratio measurement involves equal intervals with an absolute zero point.

N.B.: parametric tests can only be used with interval or ratio measurement.

Types of Data Seen in Psychological Research

Continuous numerical data is data that can take any value within a certain range. The issue with continuous numerical data is that it is heavily dependent on the accuracy of the measuring instrument.

For example: height, weight, reaction time

Discrete numerical data is data that can only take specific values within a certain range. Questions that ask how many of something, or about the presence or absence of something, usually deal with discrete numerical data.

For example: numerical scores on a questionnaire: how many times have you been overseas?

Categorical data does not deal with a specific numerical value, but rather with which group a variable can be placed into. The issue with categorical data is that it can be too coarse: the people under one label can be very different from each other. Plus, it is very difficult to make appropriate intervals for categorical data.

For example: gender, nationality, etc. Categorical data can also come from continuous or discrete variables.


Types of Statistics

Descriptive statistics summarise the properties of a sample of data usually through measures of central tendency and dispersion. Measures of central tendency include: mean, median and mode. Dispersion refers to the spread of data, providing information about the mean accuracy. Different measures of dispersion include: range, variance, and standard deviation.

Inferential statistics use the properties found from the descriptive statistics to make estimations of the properties of the population.

Central Tendency 

The mean is the average score. The mode is the most frequently occurring score, and the median is the middle score (when points are organised from lowest to highest value). The mean is the preferred measure of central tendency because it takes into account all the data. The problem with using the mean, however, is that it is easily influenced by extreme scores.


Dispersion measures the spread of data and, as mentioned above, indicates the accuracy of the mean.

The most common approach to measuring dispersion is calculating the variance: (sum of the squared deviations)/(number of observations − 1). The unit is always the square of the measurement unit. The variance indicates how much the scores differ from one another.

N.B.: The squared deviations are found by calculating the difference between each observation and the mean, and then squaring this value.

Standard deviation is the square root of the variance, and is much easier to deal with because it does not have squared units like the variance.
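The variance and standard deviation formulas above can be checked against Python's statistics module (the observations are hypothetical):

```python
import statistics

scores = [4, 8, 6, 5, 3, 7]  # hypothetical observations

mean = statistics.mean(scores)

# Sample variance: sum of squared deviations / (n - 1)
variance = sum((x - mean) ** 2 for x in scores) / (len(scores) - 1)

# Standard deviation: square root of the variance.
sd = variance ** 0.5

# The statistics module's built-in functions agree with the formulas.
print(variance, statistics.variance(scores), round(sd, 3))
```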



ZHENG, Y. (2013). Referencing and citation – Harvard style, from PSY104 Methods and Reasoning for Psychologists. University of Sheffield, Richard Roberts Building on 4th February. Available from: Blackboard.
[Accessed 4/02/13].