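An important characteristic of any set of data is the variation in the data. In some data sets, the data values are concentrated closely near the mean; in other data sets, the data values are more widely spread out from the mean.
The standard deviation is a number that measures how far data values are from their mean.
The standard deviation can be used to determine whether a particular data value is close to or far from the mean.
The standard deviation is always positive or 0. The standard deviation is small when the data are all concentrated close to the mean, and is larger when the data values show more variation from the mean.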
Suppose that we are studying waiting times at the checkout line for customers at supermarket A and supermarket B; the average wait time at both markets is 5 minutes. At market A, the standard deviation for the waiting time is 2 minutes; at market B the standard deviation for the waiting time is 4 minutes.
Because market B has a higher standard deviation, we know that there is more variation in the waiting times at market B. Overall, wait times at market B are more spread out from the average; wait times at market A are more concentrated near the average.
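Suppose that Rosa and Binh both shop at Market A. Rosa waits for 7 minutes and Binh waits for 1 minute at the checkout counter. At market A, the mean wait time is 5 minutes and the standard deviation is 2 minutes. The standard deviation can be used to determine whether a data value is close to or far from the mean.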
Rosa waits for 7 minutes:
• 7 is 2 minutes longer than the average of 5; 2 minutes is equal to one standard deviation.
• Rosa's wait time of 7 minutes is 2 minutes longer than the average of 5 minutes.
• Rosa's wait time of 7 minutes is one standard deviation above the average of 5 minutes.
Binh waits for 1 minute:
• 1 is 4 minutes less than the average of 5; 4 minutes is equal to two standard deviations.
• Binh's wait time of 1 minute is 4 minutes less than the average of 5 minutes.
• Binh's wait time of 1 minute is two standard deviations below the average of 5 minutes.
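A data value that is two standard deviations from the average is just on the borderline for what many statisticians would consider to be far from the average. Considering data to be far from the mean if it is more than 2 standard deviations away is more of an approximate "rule of thumb" than a rigid rule. In general, the shape of the distribution of the data affects how much of the data is further away than 2 standard deviations.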
The number line may help you understand standard deviation. If we were to put 5 and 7 on a number line, 7 is to the right of 5. We say, then, that 7 is one standard deviation to the right of 5 because $5 + (1)(2) = 7$.
CHAPTER 2. DESCRIPTIVE STATISTICS
If 1 were also part of the data set, then 1 is two standard deviations to the left of 5 because $5 + (-2)(2) = 1$.
• In general, a value = mean + (#ofSTDEVs)(standard deviation)
• where #ofSTDEVs = the number of standard deviations
• 7 is one standard deviation more than the mean of 5 because: $7 = 5 + (1)(2)$
• 1 is two standard deviations less than the mean of 5 because: $1 = 5 + (-2)(2)$
The equation value = mean + (#ofSTDEVs)(standard deviation) can be expressed for a sample and for a population:
• Sample: $x = \bar{x} + (\text{\#ofSTDEVs})(s)$
• Population: $x = \mu + (\text{\#ofSTDEVs})(\sigma)$
The lower case letter $s$ represents the sample standard deviation and the Greek letter $\sigma$ (sigma, lower case) represents the population standard deviation.
The symbol $\bar{x}$ is the sample mean and the Greek symbol $\mu$ is the population mean.
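The relationship value = mean + (#ofSTDEVs)(standard deviation) can be sketched in a few lines of Python. The function name `value_at` is only an illustrative choice; the numbers are the Market A wait times from the text.

```python
def value_at(mean, num_stdevs, stdev):
    """Return the value that lies num_stdevs standard deviations from the mean.

    A negative num_stdevs means the value is below (to the left of) the mean.
    """
    return mean + num_stdevs * stdev


# Market A: mean wait time 5 minutes, standard deviation 2 minutes.
rosa = value_at(5, 1, 2)    # one standard deviation above the mean
binh = value_at(5, -2, 2)   # two standard deviations below the mean
print(rosa, binh)           # 7 1
```

The same function works for a sample (pass $\bar{x}$ and $s$) or a population (pass $\mu$ and $\sigma$); only the inputs change.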
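Calculating the Standard Deviation
If $x$ is a number, then the difference "$x$ − mean" is called its deviation. In a data set, there are as many deviations as there are items in the data set. The deviations are used to calculate the standard deviation. If the numbers belong to a population, in symbols a deviation is $x - \mu$. For sample data, in symbols a deviation is $x - \bar{x}$.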
The procedure to calculate the standard deviation depends on whether the numbers are the entire population or are data from a sample. The calculations are similar, but not identical. Therefore the symbol used to represent the standard deviation depends on whether it is calculated from a population or a sample. The lower case letter $s$ represents the sample standard deviation and the Greek letter $\sigma$ (sigma, lower case) represents the population standard deviation. If the sample has the same characteristics as the population, then $s$ should be a good estimate of $\sigma$.
To calculate the standard deviation, we need to calculate the variance first. The variance is an average of the squares of the deviations (the $x - \bar{x}$ values for a sample, or the $x - \mu$ values for a population). The symbol $\sigma^2$ represents the population variance; the population standard deviation $\sigma$ is the square root of the population variance. The symbol $s^2$ represents the sample variance; the sample standard deviation $s$ is the square root of the sample variance. You can think of the standard deviation as a special average of the deviations.
If the numbers come from a census of the entire population and not a sample, when we calculate the average of the squared deviations to find the variance, we divide by $N$, the number of items in the population. If the data are from a sample rather than a population, when we calculate the average of the squared deviations, we divide by $n - 1$, one less than the number of items in the sample. You can see that in the formulas below.
Formulas for the Sample Standard Deviation
• $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\dfrac{\sum f\,(x - \bar{x})^2}{n - 1}}$
• For the sample standard deviation, the denominator is $n - 1$, that is the sample size MINUS 1.
Formulas for the Population Standard Deviation
• $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$ or $\sigma = \sqrt{\dfrac{\sum f\,(x - \mu)^2}{N}}$
• For the population standard deviation, the denominator is $N$, the number of items in the population.
In these formulas, $f$ represents the frequency with which a value appears. For example, if a value appears once, $f$ is 1. If a value appears three times in the data set or population, $f$ is 3.
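The frequency forms of the two formulas differ only in the denominator. The sketch below is one possible way to code them; the function name `weighted_std`, its `sample` flag, and the small data set are illustrative assumptions, not part of the text.

```python
import math


def weighted_std(values, freqs, sample=True):
    """Standard deviation of data given as values with frequencies.

    sample=True divides by (n - 1); sample=False divides by N.
    """
    count = sum(freqs)                                   # n or N
    mean = sum(f * x for x, f in zip(values, freqs)) / count
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    denominator = count - 1 if sample else count
    return math.sqrt(sum_sq / denominator)


# Hypothetical data: the value 2 appears three times, 1 and 3 once each.
values = [1, 2, 3]
freqs = [1, 3, 1]
print(weighted_std(values, freqs, sample=True))    # treats the data as a sample
print(weighted_std(values, freqs, sample=False))   # treats the data as a population
```

For the same data the sample result is always at least as large as the population result, because the same sum of squared deviations is divided by a smaller number.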
Sampling Variability of a Statistic
The statistic of a sampling distribution was discussed in Descriptive Statistics: Measuring the Center of the Data. How much the statistic varies from one sample to another is known as the sampling variability of a statistic. You typically measure the sampling variability of a statistic by its standard error. The standard error of the mean is an example of a standard error. It is a special standard deviation and is known as the standard deviation of the sampling distribution of the mean. You will cover the standard error of the mean in The Central Limit Theorem (not now). The notation for the standard error of the mean is $\dfrac{\sigma}{\sqrt{n}}$, where $\sigma$ is the standard deviation of the population and $n$ is the size of the sample.
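As a small illustration of the notation $\sigma / \sqrt{n}$, the sketch below evaluates it for two sample sizes. The function name and the numbers ($\sigma = 2$, $n = 16$ and $n = 64$) are hypothetical and chosen only for easy arithmetic.

```python
import math


def standard_error_of_mean(sigma, n):
    """Standard error of the mean: population standard deviation over sqrt(n)."""
    return sigma / math.sqrt(n)


# Hypothetical population standard deviation of 2.
print(standard_error_of_mean(2, 16))   # 0.5
print(standard_error_of_mean(2, 64))   # 0.25
```

Quadrupling the sample size halves the standard error, which is why larger samples give sample means that vary less from sample to sample.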
NOTE: In practice, USE A CALCULATOR OR COMPUTER SOFTWARE TO CALCULATE THE STANDARD DEVIATION. If you are using a TI-83, 83+, 84+ calculator, you need to select the appropriate standard deviation $\sigma_x$ or $s_x$ from the summary statistics. We will concentrate on using and interpreting the information that the standard deviation gives us. However, you should study the following step-by-step example to help you understand how the standard deviation measures variation from the mean.
Example 2.21
In a fifth grade class, the teacher was interested in the average age and the sample standard deviation of the ages of her students. The following data are the ages for a SAMPLE of n = 20 fifth grade students.
9; 9.5; 9.5; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5
$$\bar{x} = \frac{9 + 9.5 \times 2 + 10 \times 4 + 10.5 \times 4 + 11 \times 6 + 11.5 \times 3}{20} = 10.525 \qquad (2.8)$$
The average age is 10.53 years, rounded to 2 places.
The variance may be calculated by using a table. Then the standard deviation is calculated by taking the square root of the variance. We will explain the parts of the table after calculating $s$.
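| Data | Freq. | Deviations | Deviations$^2$ | (Freq.)(Deviations$^2$) |
|---|---|---|---|---|
| $x$ | $f$ | $(x - \bar{x})$ | $(x - \bar{x})^2$ | $(f)(x - \bar{x})^2$ |
| 9 | 1 | $9 - 10.525 = -1.525$ | $(-1.525)^2 = 2.325625$ | $1 \times 2.325625 = 2.325625$ |
| 9.5 | 2 | $9.5 - 10.525 = -1.025$ | $(-1.025)^2 = 1.050625$ | $2 \times 1.050625 = 2.101250$ |
| 10 | 4 | $10 - 10.525 = -0.525$ | $(-0.525)^2 = 0.275625$ | $4 \times 0.275625 = 1.1025$ |
| 10.5 | 4 | $10.5 - 10.525 = -0.025$ | $(-0.025)^2 = 0.000625$ | $4 \times 0.000625 = 0.0025$ |
| 11 | 6 | $11 - 10.525 = 0.475$ | $(0.475)^2 = 0.225625$ | $6 \times 0.225625 = 1.35375$ |
| 11.5 | 3 | $11.5 - 10.525 = 0.975$ | $(0.975)^2 = 0.950625$ | $3 \times 0.950625 = 2.851875$ |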
Table 2.7
The sample variance, $s^2$, is equal to the sum of the last column (9.7375) divided by the total number of data values minus one (20 − 1):
$$s^2 = \frac{9.7375}{20 - 1} = 0.5125$$
The sample standard deviation $s$ is equal to the square root of the sample variance:
$$s = \sqrt{0.5125} = 0.715891$$
Rounded to two decimal places, $s = 0.72$.
Typically, you do the calculation for the standard deviation on your calculator or computer. The intermediate results are not rounded. This is done for accuracy.
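The worked calculation above can be checked on a computer. The sketch below uses Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n − 1; the choice of Python is an assumption here, since the text itself only describes calculator steps.

```python
import statistics

# The 20 ages from Example 2.21.
ages = [9, 9.5, 9.5, 10, 10, 10, 10, 10.5, 10.5, 10.5, 10.5,
        11, 11, 11, 11, 11, 11, 11.5, 11.5, 11.5]

mean = statistics.mean(ages)           # sample mean
variance = statistics.variance(ages)   # sample variance, divides by n - 1
stdev = statistics.stdev(ages)         # square root of the sample variance

print(round(mean, 3), round(variance, 4), round(stdev, 6))
```

No intermediate value is rounded inside the calculation; rounding happens only when the results are printed.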
Problem 1
Verify the mean and standard deviation calculated above on your calculator or computer.
Solution
Using the TI-83, 83+, 84+ Calculators
• Enter data into the list editor. Press STAT 1:EDIT. If necessary, clear the lists by arrowing up into the name. Press CLEAR and arrow down.
• Put the data values (9, 9.5, 10, 10.5, 11, 11.5) into list L1 and the frequencies (1, 2, 4, 4, 6, 3) into list L2. Use the arrow keys to move around.
• Press STAT and arrow to CALC. Press 1:1-VarStats and enter L1 (2nd 1), L2 (2nd 2). Do not forget the comma. Press ENTER.
• $\bar{x} = 10.525$
• Use Sx because this is sample data (not a population): Sx = 0.715891
• For the following problems, recall that value = mean + (#ofSTDEVs)(standard deviation)
• For a sample: $x = \bar{x} + (\text{\#ofSTDEVs})(s)$
• For a population: $x = \mu + (\text{\#ofSTDEVs})(\sigma)$
• For this example, use $x = \bar{x} + (\text{\#ofSTDEVs})(s)$ because the data is from a sample
Problem 2
Find the value that is 1 standard deviation above the mean. Find ($\bar{x} + 1s$).
Solution
($\bar{x} + 1s$) = 10.53 + (1)(0.72) = 11.25
Problem 3
Find the value that is two standard deviations below the mean. Find ($\bar{x} - 2s$).
Solution
($\bar{x} - 2s$) = 10.53 − (2)(0.72) = 9.09
Problem 4
Find the values that are 1.5 standard deviations from (below and above) the mean.
Solution
• ($\bar{x} - 1.5s$) = 10.53 − (1.5)(0.72) = 9.45
• ($\bar{x} + 1.5s$) = 10.53 + (1.5)(0.72) = 11.61
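The answers to Problems 2 through 4 all come from the same expression, $\bar{x} + (\text{\#ofSTDEVs})(s)$. A minimal sketch, using the rounded values 10.53 and 0.72 from the solutions (the helper name `from_mean` is an illustrative choice):

```python
def from_mean(mean, s, num_stdevs):
    """Value lying num_stdevs sample standard deviations from the mean."""
    return mean + num_stdevs * s


mean, s = 10.53, 0.72   # rounded values used in the worked solutions

for k in (1, -2, -1.5, 1.5):
    print(k, round(from_mean(mean, s, k), 2))
```

Carrying the unrounded values (10.525 and 0.715891) through the same function shifts some results in the second decimal place, for example 11.24 instead of 11.25 for one standard deviation above the mean.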
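Explanation of the standard deviation calculation shown in the table
The deviations show how spread out the data are about the mean. The data value 11.5 is farther from the mean than is the data value 11. The deviations 0.975 and 0.475 indicate that. A positive deviation occurs when the data value is greater than the mean. A negative deviation occurs when the data value is less than the mean; the deviation is −1.525 for the data value 9. If you add the deviations, the sum is always zero, so you cannot simply add the deviations to get the spread of the data. By squaring the deviations, you make them positive numbers, and the sum will also be positive. The variance, then, is the average squared deviation.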
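The variance is a squared measure and does not have the same units as the data. Taking the square root solves the problem: the standard deviation measures the spread in the same units as the data.
Notice that instead of dividing by n = 20, the calculation divided by n − 1 = 20 − 1 = 19 because the data is a sample. For the sample variance, we divide by the sample size minus one (n − 1). Why not divide by n? The answer has to do with the population variance. The sample variance is an estimate of the population variance. Based on the theoretical mathematics that lies behind these calculations, dividing by (n − 1) gives a better estimate of the population variance.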
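One way to see the effect of the divisor is to list every possible sample from a very small population and average the resulting variance estimates. The sketch below is only an illustration under stated assumptions: the population {2, 4, 6} is hypothetical, samples have size 2, and sampling is done with replacement. Exact fractions are used so that no rounding hides the comparison.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6]   # hypothetical population, chosen for easy arithmetic
N = len(population)
mu = Fraction(sum(population), N)
pop_variance = sum((x - mu) ** 2 for x in population) / N


def avg_squared_deviation(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by divisor."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor


n = 2
samples = list(product(population, repeat=n))   # all 9 ordered samples, with replacement

avg_div_n = sum(avg_squared_deviation(s, n) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(avg_squared_deviation(s, n - 1) for s in samples) / len(samples)

print(pop_variance, avg_div_n, avg_div_n_minus_1)   # 8/3 4/3 8/3
```

Averaged over all possible samples, dividing by n − 1 reproduces the population variance of 8/3, while dividing by n gives 4/3, which is too small.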
NOTE: Your concentration should be on what the standard deviation tells us about the data. Let a calculator or computer do the arithmetic.
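The standard deviation, $s$ or $\sigma$, is either zero or larger than zero. When the standard deviation is 0, there is no spread; that is, all the data values are equal to each other. The standard deviation is small when the data are all concentrated close to the mean, and is larger when the data values show more variation from the mean; outliers can make $s$ or $\sigma$ very large.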
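The standard deviation, when first presented, can seem unclear. By graphing your data, you can get a better "feel" for the deviations and the standard deviation. You will find that in symmetrical distributions, the standard deviation can be very helpful, but in skewed distributions the standard deviation may not be much help. In a skewed distribution, it is better to look at the first quartile, the median, the third quartile, the smallest value, and the largest value. Because numbers can be confusing, always graph your data.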
NOTE: The formula for the standard deviation is at the end of the chapter.
Example 2.22
Use the following data (first exam scores) from Susan Dean's spring pre-calculus class:
33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100
a. Create a chart containing the data, frequencies, relative frequencies, and cumulative relative frequencies to three decimal places.
b. Calculate the following to one decimal place using a TI-83+ or TI-84 calculator:
i. The sample mean
ii. The sample standard deviation
iii. The median
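iv. The first quartile
v. The third quartile
vi. IQR
c. Construct a box plot and a histogram on the same set of axes. Make comments about the box plot, the histogram, and the chart.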
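Solution
a.
| Data | Frequency | Relative Frequency | Cumulative Relative Frequency |
|---|---|---|---|
| 33 | 1 | 0.032 | 0.032 |
| 42 | 1 | 0.032 | 0.064 |
| 49 | 2 | 0.065 | 0.129 |
| 53 | 1 | 0.032 | 0.161 |
| 55 | 2 | 0.065 | 0.226 |
| 61 | 1 | 0.032 | 0.258 |
| 63 | 1 | 0.032 | 0.290 |
| 67 | 1 | 0.032 | 0.322 |
| 68 | 2 | 0.065 | 0.387 |
| 69 | 2 | 0.065 | 0.452 |
| 72 | 1 | 0.032 | 0.484 |
| 73 | 1 | 0.032 | 0.516 |
| 74 | 1 | 0.032 | 0.548 |
| 78 | 1 | 0.032 | 0.580 |
| 80 | 1 | 0.032 | 0.612 |
| 83 | 1 | 0.032 | 0.644 |
| 88 | 3 | 0.097 | 0.741 |
| 90 | 1 | 0.032 | 0.773 |
| 92 | 1 | 0.032 | 0.805 |
| 94 | 4 | 0.129 | 0.934 |
| 96 | 1 | 0.032 | 0.966 |
| 100 | 1 | 0.032 | 0.998 (Why isn't this value 1?) |

Table 2.8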
b. i. The sample mean = 73.5
ii. The sample standard deviation = 17.9
iii. The median = 73
iv. The first quartile = 61
v. The third quartile = 90
vi. IQR = 90 − 61 = 29
c. The x-axis goes from 32.5 to 100.5; the y-axis goes from -2.4 to 15 for the histogram; the number of intervals is 5 for the histogram, so the width of an interval is (100.5 − 32.5) divided by 5, which is equal to 13.6. Endpoints of the intervals: starting point is 32.5, 32.5 + 13.6 = 46.1, 46.1 + 13.6 = 59.7, 59.7 + 13.6 = 73.3, 73.3 + 13.6 = 86.9, 86.9 + 13.6 = 100.5 = the ending value. No data values fall on an interval boundary.
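Figure 2.1
The spread of the exam scores in the lower 50% is greater (73 − 33 = 40) than the spread in the upper 50% (100 − 73 = 27). The histogram, box plot, and chart all reflect this. There are a substantial number of A and B grades. The box plot shows us that the middle 50% of the exam scores (IQR = 29) are Ds, Cs, and Bs. The box plot also shows us that the lower 25% of the exam scores are Ds and Fs.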
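Parts a and b of this example can also be reproduced on a computer. The sketch below uses Python's standard library; treating `statistics.quantiles` with `method="exclusive"` as the quartile rule is an assumption that happens to match the calculator results for these 31 scores, and other quartile conventions can give slightly different values on other data.

```python
import statistics
from collections import Counter

scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]
n = len(scores)

# Part a: frequencies, relative frequencies rounded to three places,
# and a running total of the ROUNDED relative frequencies.
cumulative = 0.0
for value, freq in sorted(Counter(scores).items()):
    rel = round(freq / n, 3)
    cumulative = round(cumulative + rel, 3)
    print(value, freq, rel, cumulative)

# Part b: summary statistics.
mean = statistics.mean(scores)
stdev = statistics.stdev(scores)     # sample standard deviation (n - 1)
q1, median, q3 = statistics.quantiles(scores, n=4, method="exclusive")
iqr = q3 - q1
print(round(mean, 1), round(stdev, 1), median, q1, q3, iqr)
```

Because each relative frequency is rounded before it is added, the running total ends at 0.998 rather than 1, which answers the question posed in the last row of the table.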
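Comparing Values from Different Data Sets
The standard deviation is useful when comparing data values that come from different data sets. If the data sets have different means and standard deviations, it can be misleading to compare the data values directly.
• For each data value, calculate how many standard deviations the value is away from its mean.
• Use the formula: value = mean + (#ofSTDEVs)(standard deviation); solve for #ofSTDEVs.
• $\text{\#ofSTDEVs} = \dfrac{\text{value} - \text{mean}}{\text{standard deviation}}$
• Compare the results of this calculation.
#ofSTDEVs is often called a "z-score"; we can use the symbol $z$. In symbols, the formulas become: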
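| | Value | z-score |
|---|---|---|
| Sample | $x = \bar{x} + zs$ | $z = \dfrac{x - \bar{x}}{s}$ |
| Population | $x = \mu + z\sigma$ | $z = \dfrac{x - \mu}{\sigma}$ |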
Table 2.9
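Example 2.23
Two students, John and Ali, from different high schools, wanted to find out who had the highest G.P.A. when compared to his school. Which student had the highest G.P.A. when compared to his school?

| Student | GPA | School Mean GPA | School Standard Deviation |
|---|---|---|---|
| John | 2.85 | 3.0 | 0.7 |
| Ali | 77 | 80 | 10 |

Table 2.10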
Solution
For each student, determine how many standard deviations (#ofSTDEVs) his GPA is away from the average, for his school. Pay careful attention to signs when comparing and interpreting the answer.
$$\text{\#ofSTDEVs} = \frac{\text{value} - \text{mean}}{\text{standard deviation}}; \qquad z = \frac{x - \mu}{\sigma}$$
For John, $z = \text{\#ofSTDEVs} = \dfrac{2.85 - 3.0}{0.7} = -0.21$
For Ali, $z = \text{\#ofSTDEVs} = \dfrac{77 - 80}{10} = -0.3$
John has the better G.P.A. when compared to his school because his G.P.A. is 0.21 standard deviations below his school's mean while Ali's G.P.A. is 0.3 standard deviations below his school's mean.
John's z-score of −0.21 is higher than Ali's z-score of −0.3. For GPA, higher values are better, so we conclude that John has the better GPA when compared to his school.
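The comparison in this example can be sketched as a short calculation. The function name `z_score` is an illustrative choice; the inputs are the values from Table 2.10.

```python
def z_score(value, mean, stdev):
    """Number of standard deviations a value lies from its mean (signed)."""
    return (value - mean) / stdev


john = z_score(2.85, 3.0, 0.7)   # GPA on a 4-point scale
ali = z_score(77, 80, 10)        # GPA on a 100-point scale

print(round(john, 2), round(ali, 2))       # -0.21 -0.3
print("John" if john > ali else "Ali")     # John
```

Because a z-score has no units, the two GPAs can be compared even though the schools use different scales.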
The following lists give a few facts that provide a little more insight into what the standard deviation tells us about the distribution of the data.
For ANY data set, no matter what the distribution of the data is:
• At least 75% of the data is within 2 standard deviations of the mean.
• At least 89% of the data is within 3 standard deviations of the mean.
• At least 95% of the data is within 4 1/2 standard deviations of the mean.
• This is known as Chebyshev's Rule.
For data having a distribution that is MOUND-SHAPED and SYMMETRIC:
• Approximately 68% of the data is within 1 standard deviation of the mean.
• Approximately 95% of the data is within 2 standard deviations of the mean.
• More than 99% of the data is within 3 standard deviations of the mean.
• This is known as the Empirical Rule.
• It is important to note that this rule only applies when the shape of the distribution of the data is mound-shaped and symmetric. We will learn more about this when studying the "Normal" or "Gaussian" probability distribution in later chapters.
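The three percentages in Chebyshev's Rule come from the general bound $1 - 1/k^2$ for $k$ standard deviations, which is the usual statement of Chebyshev's inequality rather than something spelled out in this section. The sketch below evaluates that bound and, as a check on one data set, counts how much of the Example 2.21 age data falls within 2 standard deviations of the mean. The function names are illustrative, and the data are treated as a complete data set by using the population standard deviation (`statistics.pstdev`).

```python
import statistics


def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Observed fraction of data within k standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)   # divides by N
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)


for k in (2, 3, 4.5):
    print(k, round(chebyshev_lower_bound(k), 3))   # 0.75, 0.889, 0.951

ages = [9, 9.5, 9.5, 10, 10, 10, 10, 10.5, 10.5, 10.5, 10.5,
        11, 11, 11, 11, 11, 11, 11.5, 11.5, 11.5]
print(fraction_within(ages, 2))   # 0.95
```

The observed 95% is well above the guaranteed 75%; Chebyshev's Rule gives only a minimum, which is why it holds for any distribution.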
**With contributions from Roberta Bloom
2.10 Summary of Formulas
Commonly Used Symbols
• $n$ = the number of data values in a sample
• $N$ = the number of people, things, etc. in the population
• $\bar{x}$ = the sample mean
• $s$ = the sample standard deviation
• $\mu$ = the population mean
• $\sigma$ = the population standard deviation
• $f$ = frequency
• $x$ = numerical value
Commonly Used Expressions
• $x \cdot f$ = A value multiplied by its respective frequency
• $\sum x$ = The sum of the values
• $\sum x \cdot f$ = The sum of values multiplied by their respective frequencies
• $(x - \bar{x})$ or $(x - \mu)$ = Deviations from the mean (how far a value is from the mean)
• $(x - \bar{x})^2$ or $(x - \mu)^2$ = Deviations squared
• $f(x - \bar{x})^2$ or $f(x - \mu)^2$ = The deviations squared and multiplied by their frequencies
Mean Formulas:
• $\bar{x} = \dfrac{\sum x}{n}$ or $\bar{x} = \dfrac{\sum f \cdot x}{n}$
• $\mu = \dfrac{\sum x}{N}$ or $\mu = \dfrac{\sum f \cdot x}{N}$
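Standard Deviation Formulas:
• $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\dfrac{\sum f\,(x - \bar{x})^2}{n - 1}}$
• $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$ or $\sigma = \sqrt{\dfrac{\sum f\,(x - \mu)^2}{N}}$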
Formulas Relating a Value, the Mean, and the Standard Deviation:
• value = mean + (#ofSTDEVs)(standard deviation), where #ofSTDEVs = the number of standard deviations
• $x = \bar{x} + (\text{\#ofSTDEVs})(s)$
• $x = \mu + (\text{\#ofSTDEVs})(\sigma)$