by David M. Lane
Chapter 5: Basic Concepts
1. Compute the probability of a condition from hits, false alarms, and base rates
using a tree diagram
2. Compute the probability of a condition from hits, false alarms, and base rates
using Bayes' Theorem
Suppose that at your regular physical exam you test positive for Disease X.
Although Disease X has only mild symptoms, you are concerned and ask your
doctor about the accuracy of the test. It turns out that the test is 95% accurate. It
would appear that the probability that you have Disease X is therefore 0.95.
However, the situation is not that simple.
For one thing, more information about the accuracy of the test is needed
because there are two kinds of errors the test can make: misses and false positives.
If you actually have Disease X and the test failed to detect it, that would be a miss.
If you did not have Disease X and the test indicated you did, that would be a false
positive. The miss and false positive rates are not necessarily the same. For
example, suppose that the test accurately indicates the disease in 99% of the people
who have it and accurately indicates no disease in 91% of the people who do not
have it. In other words, the test has a miss rate of 0.01 and a false positive rate of
0.09. This might lead you to revise your judgment and conclude that your chance
of having the disease is 0.91. This would not be correct since the probability
depends on the proportion of people having the disease. This proportion is called
the base rate.
Assume that Disease X is a rare disease, and only 2% of people in your
situation have it. How does that affect the probability that you have it? Or, more
generally, what is the probability that someone who tests positive actually has the
disease? Let's consider what would happen if one million people were tested. Out
of these one million people, 2% or 20,000 people would have the disease. Of these
20,000 with the disease, the test would accurately detect it in 99% of them. This
means that 19,800 cases would be accurately identified. Now let's consider the
98% of the one million people (980,000) who do not have the disease. Since the
false positive rate is 0.09, 9% of these 980,000 people will test positive for the
disease. This is a total of 88,200 people incorrectly diagnosed.
To sum up, 19,800 people who tested positive would actually have the
disease and 88,200 people who tested positive would not have the disease. This
means that of all those who tested positive, only
19,800/(19,800 + 88,200) = 0.1833
of them would actually have the disease. So the probability that you have the
disease is not 0.95, or 0.91, but only 0.1833.
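The counting argument above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative and do not come from the text, and the rates are the ones assumed in the Disease X example.

```python
# Natural-frequency version of the Disease X example.
# All variable names are illustrative; the rates come from the example above.
population = 1_000_000
base_rate = 0.02             # proportion of people who have the disease
hit_rate = 0.99              # P(test positive | disease)
false_positive_rate = 0.09   # P(test positive | no disease)

with_disease = round(population * base_rate)       # people who have the disease
without_disease = population - with_disease        # people who do not

true_positives = round(with_disease * hit_rate)                 # correctly detected
false_positives = round(without_disease * false_positive_rate)  # incorrectly flagged

# Proportion of all positive tests that belong to people with the disease.
p_disease_given_positive = true_positives / (true_positives + false_positives)

print(true_positives, false_positives)
print(round(p_disease_given_positive, 4))
```

Changing `base_rate` while holding the other two rates fixed shows how strongly the answer depends on how common the disease is.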
These results are summarized in Table 1. Of the one million people tested, the
test was correct for 891,800 of those without the disease and for 19,800 of those
with the disease; that is, it was correct for 911,600 of the 1,000,000 people, or
about 91% of the time. However, if you look only at the people testing positive,
only 19,800 (a proportion of 0.1833) of the 88,200 + 19,800 = 108,000 people
testing positive actually have the disease.
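Table 1. Diagnosing Disease X.
True Condition          Test Result    Number of People
No Disease (980,000)    Positive       88,200
No Disease (980,000)    Negative       891,800
Disease (20,000)        Positive       19,800
Disease (20,000)        Negative       200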
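The contrast between the test's overall accuracy and the probability of disease given a positive result can be made concrete by computing all four cells of the diagnosis table. The sketch below assumes the rates from the Disease X example; the function `diagnostic_counts` and its dictionary keys are hypothetical names chosen for illustration.

```python
def diagnostic_counts(population, base_rate, hit_rate, false_positive_rate):
    """Return the four cells of a 2x2 diagnosis table as whole-number counts.

    Hypothetical helper for illustration; not part of the original text.
    """
    diseased = round(population * base_rate)
    healthy = population - diseased
    true_pos = round(diseased * hit_rate)
    false_pos = round(healthy * false_positive_rate)
    return {
        "true_positives": true_pos,
        "misses": diseased - true_pos,
        "false_positives": false_pos,
        "true_negatives": healthy - false_pos,
    }


counts = diagnostic_counts(1_000_000, 0.02, 0.99, 0.09)
total = sum(counts.values())

# Overall accuracy: correct results (both kinds) out of everyone tested.
accuracy = (counts["true_positives"] + counts["true_negatives"]) / total

# Probability of disease among those who tested positive.
positives = counts["true_positives"] + counts["false_positives"]
p_disease_given_positive = counts["true_positives"] / positives

print(counts)
print(f"overall accuracy: {accuracy:.4f}")
print(f"P(disease | positive): {p_disease_given_positive:.4f}")
```

The two quantities answer different questions: accuracy averages over everyone tested, most of whom are healthy, while the second figure conditions on a positive result.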
This same result can be obtained using Bayes' theorem. Bayes' theorem considers
both the prior probability of an event and the diagnostic value of a test to determine
the posterior probability of the event. For the current example, the event is that you
have Disease X. Let's call this Event D. Since only 2% of people in your situation
have Disease X, the prior probability of Event D is 0.02. Or, more formally, P(D) =
0.02. If D' represents the event that Event D is false, then P(D') = 1 - P(D) =
0.98.
To define the diagnostic value of the test, we need to define another event:
that you test positive for Disease X. Let's call this Event T. The diagnostic value of
the test depends on the probability you will test positive given that you actually
have the disease, written as P(T|D), and the probability you test positive given that
you do not have the disease, written as P(T|D'). Bayes' theorem shown below
allows you to calculate P(D|T), the probability that you have the disease given that
you test positive for it.
$$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + P(T \mid D')\,P(D')}$$
The various terms are:
P(T|D) = 0.99
P(T|D') = 0.09
P(D) = 0.02
P(D') = 0.98
Therefore,
P(D|T) = (0.99)(0.02) / [(0.99)(0.02) + (0.09)(0.98)] = 0.0198/0.1080 = 0.1833
which is the same value computed previously.
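The Bayes' theorem calculation can also be sketched as a small Python function. The function name `posterior` and its parameter names are assumptions made for this illustration; the formula is the one described above, with P(D') computed as 1 - P(D).

```python
def posterior(prior, p_pos_given_d, p_pos_given_not_d):
    """Return P(D|T) from P(D), P(T|D), and P(T|D') using Bayes' theorem.

    Illustrative helper; the names are not from the original text.
    """
    numerator = p_pos_given_d * prior                           # P(T|D) P(D)
    denominator = numerator + p_pos_given_not_d * (1 - prior)   # + P(T|D') P(D')
    return numerator / denominator


# Disease X example: P(D) = 0.02, P(T|D) = 0.99, P(T|D') = 0.09.
p_disease_x = posterior(prior=0.02, p_pos_given_d=0.99, p_pos_given_not_d=0.09)
print(round(p_disease_x, 4))

# The same test applied with different base rates (hypothetical values).
for prior in (0.001, 0.02, 0.20, 0.50):
    print(prior, round(posterior(prior, 0.99, 0.09), 4))
```

The loop uses made-up base rates purely to show that the posterior probability rises as the condition becomes more common, even though the test itself is unchanged.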
• Chapter 5: Base Rates
gives the FBI list of warning signs for school shooters.
What do you think?
Do you think it is likely that someone showing a majority of these signs would
actually shoot people in school?
Fortunately the vast majority of students do not become
shooters. It is necessary to take this base rate information into
account in order to compute the probability that any given
student will be a shooter. The warning signs are unlikely to be
sufficiently predictive to warrant the conclusion that a student
will become a shooter. If an action is taken on the basis of these
warning signs, it is likely that the student involved would never
have become a shooter even without the action.
All material presented in the Probability Chapter
1. (a) What is the probability of rolling a pair of dice and obtaining a total score of
9 or more? (b) What is the probability of rolling a pair of dice and obtaining a
total score of 7?
2. A box contains four black pieces of cloth, two striped pieces, and six dotted
pieces. A piece is selected randomly and then placed back in the box. A second
piece is selected randomly. What is the probability that:
a. both pieces are dotted?
b. the first piece is black and the second piece is dotted?
c. one piece is black and one piece is striped?
3. A card is drawn at random from a deck. (a) What is the probability that it is an
ace or a king? (b) What is the probability that it is either a red card or a black
card?
4. The probability that you will win a game is 0.45. (a) If you play the game 80
times, what is the most likely number of wins? (b) What are the mean and
variance of a binomial distribution with p = 0.45 and N = 80?
5. A fair coin is flipped 9 times. What is the probability of getting exactly 6 heads?
6. When Susan and Jessica play a card game, Susan wins 60% of the time. If they
play 9 games, what is the probability that Jessica will have won more games than
Susan?
7. You flip a coin three times. (a) What is the probability of getting heads on only
one of your flips? (b) What is the probability of getting heads on at least one flip?
8. A test correctly identifies a disease in 95% of people who have it. It correctly
identifies no disease in 94% of people who do not have it. In the population, 3%
of the people have the disease. What is the probability that you have the disease
if you tested positive?
9. A jar contains 10 blue marbles, 5 red marbles, 4 green marbles, and 1 yellow
marble. Two marbles are chosen (without replacement). (a) What is the
probability that one will be green and the other red? (b) What is the probability
that one will be blue and the other yellow?
10. You roll a fair die five times, and you get a 6 each time. What is the probability
that you get a 6 on the next roll?
11. You win a game if you roll a die and get a 2 or a 5. You play this game 60
times.
a. What is the probability that you win between 5 and 10 times (inclusive)?
b. What is the probability that you will win the game at least 15 times?
c. What is the probability that you will win the game at least 40 times?
d. What is the most likely number of wins?
e. What is the probability of obtaining the number of wins in d?
Explain how you got each answer or show your work.
12. In a baseball game, Tommy gets a hit 30% of the time when facing this pitcher.
Joey gets a hit 25% of the time. They are both coming up to bat this inning.
a. What is the probability that Joey or Tommy will get a hit?
b. What is the probability that neither player gets a hit?
c. What is the probability that they both get a hit?
13. An unfair coin has a probability of coming up heads of 0.65. The coin is flipped
50 times. What is the probability it will come up heads 25 or fewer times?
(Give your answer to at least 3 decimal places.)
14. You draw two cards from a deck. What is the probability that:
a. both of them are face cards (king, queen, or jack)?
b. both of them are hearts?
15. True/False: You are more likely to get a pattern of HTHHHTHTTH than
HHHHHHHHTT when you flip a coin 10 times.
16. True/False: Suppose that at your regular physical exam you test positive for a
relatively rare disease. You will need to start taking medicine if you have the
disease, so you ask your doctor about the accuracy of the test. It turns out that
the test is 98% accurate. The probability that you have Disease X is therefore
0.98 and the probability that you do not have it is 0.02. Explain your answer.
Questions from Case Studies
Diet and Health (DH) case study
17. a. What percentage of people on the AHA diet had some sort of illness or
death?
b. What is the probability that if you randomly selected a person on the AHA
diet, he or she would have some sort of illness or death?
c. If 3 people on the AHA diet are chosen at random, what is the probability
that they will all be healthy?
18. a. What percentage of people on the Mediterranean diet had some sort of
illness or death?
b. What is the probability that if you randomly selected a person on the
Mediterranean diet, he or she would have some sort of illness or death?
c. What is the probability that if you randomly selected a person on the
Mediterranean diet, he or she would have cancer?
d. If you randomly select five people from the Mediterranean diet, what is the
probability that they would all be healthy?
The following questions are from ARTIST (reproduced with permission)
19. Five faces of a fair die are painted black, and one face is painted white. The die
is rolled six times. Which of the following results is more likely?
a. Black side up on five of the rolls; white side up on the other roll
b. Black side up on all six rolls
c. a and b are equally likely
20. One of the items on the student survey for an introductory statistics course was
“Rate your intelligence on a scale of 1 to 10.” The distribution of this variable
for the 100 women in the class is presented below. What is the probability of
randomly selecting a woman from the class who has an intelligence rating that
is LESS than seven (7)?
a. (12 + 24)/100 = .36
b. (12 + 24 + 38)/100 = .74
c. 38/100 = .38
d. (23 + 2 + 1)/100 = .26
e. None of the above.
21. You roll 2 fair six-sided dice. Which of the following outcomes is most likely
to occur on the next roll? A. Getting double 3. B. Getting a 3 and a 4. C. They
are equally likely. Explain your choice.
22. If Tahnee flips a coin 10 times, and records the results (Heads or Tails), which
outcome below is more likely to occur, A or B? Explain your choice.
23. A bowl has 100 wrapped hard candies in it. 20 are yellow, 50 are red, and 30
are blue. They are well mixed up in the bowl. Jenny pulls out a handful of 10
candies, counts the number of reds, and tells her teacher. The teacher writes the
number of red candies on a list. Then, Jenny puts the candies back into the
bowl, and mixes them all up again. Four of Jenny’s classmates, Jack, Julie,
Jason, and Jerry do the same thing. They each pick ten candies, count the reds,
and the teacher writes down the number of reds. Then they put the candies
back and mix them up again each time. The teacher’s list for the number of
reds is most likely to be (please select one):
24. An insurance company writes policies for a large number of newly-licensed
drivers each year. Suppose 40% of these are low-risk drivers, 40% are
moderate risk, and 20% are high risk. The company has no way to know which
group any individual driver falls in when it writes the policies. None of the
low-risk drivers will have an at-fault accident in the next year, but 10% of the
moderate-risk and 20% of the high-risk drivers will have such an accident. If a
driver has an at-fault accident in the next year, what is the probability that he or
she is high-risk?
25. You are to participate in an exam for which you had no chance to study, and for
that reason cannot do anything but guess for each question (all questions being
of the multiple choice type, so the chance of guessing the correct answer for
each question is 1/d, d being the number of options (distractors) per question;