1.1;1.5;2.3;2.5;2.7;3.2;3.3;3.3;3.5;3.8;4.0;4.2;4.5;4.5;4.7;4.8;5.5;5.6;6.5;6.7;12.3
The data are the distance (in kilometers) from a home to the nearest supermarket.
Problem
(Solution on p. 114.)
1. Are there any values that might possibly be outliers?
2. Do the data seem to have any concentration of values?
HINT: The leaves are to the right of the decimal.
Another type of graph that is useful for specific data values is a line graph. In the particular line graph
shown in the example, the x-axis consists of data values and the y-axis consists of frequency points. The
frequency points are connected.
Example 2.3
his/her chores. The results are shown in the table and the line graph.

Number of times teenager is reminded    Frequency
0                                       2
1                                       5
2                                       8
3                                       14
4                                       7
5                                       4

Table 2.2
Bar graphs consist of bars that are separated from each other. The bars can be rectangles or they can be
rectangular boxes, and they can be vertical or horizontal.
The bar graph shown in Example 2.4 has age groups represented on the x-axis and proportions on the y-axis.
CHAPTER 2. DESCRIPTIVE STATISTICS
Example 2.4
By the end of 2011, in the United States, Facebook had over 146 million users. The table
shows three age groups, the number of users in each age group, and the proportion (%) of
revisited-2011-statistics-2/
Age groups    Number of users    Proportion (%) of users
13-25         65,082,280         45%
26-44         53,300,200         36%
45-64         27,885,100         19%

Table 2.3
Example 2.5
The columns in the table below contain the race/ethnicity of U.S. Public Schools: High School
Class of 2011, percentages for the Advanced Placement Examinee Population for that class,
and percentages for the Overall Student Population. The 3-dimensional graph shows the
Examinee Population percentages on the y-axis. (Source: http://www.collegeboard.com and
Source: http://apreport.collegeboard.org/goals-and-findings/promoting-equity)
Race/Ethnicity                                   AP Examinee Population    Overall Student Population
1 = Asian, Asian American or Pacific Islander    10.3%                     5.7%
2 = Black or African American                    9.0%                      14.7%
3 = Hispanic or Latino                           17.0%                     17.6%
Native                                           0.6%                      1.1%
5 = White                                        57.1%                     59.2%
6 = Not reported/other                           6.0%                      1.7%

Table 2.4
Go to Outcomes of Education Figure 22 (footnote 4) for an example of a bar graph that shows unemployment
rates of persons 25 years and older for 2009.
NOTE: This book contains instructions for constructing a histogram and a box plot for the TI-83+
Instruments (TI) website (footnote 5).
2.4 Histograms
6
set consists of 100 values or more.
A histogram consists of contiguous boxes. It has both a horizontal axis and a vertical axis. The horizontal
axis is labeled with what the data represents (for instance, distance from your home to school). The vertical
axis is labeled either Frequency or relative frequency. The graph will have the same shape with either
4. http://nces.ed.gov/pubs2011/2011015_5.pdf
5. http://education.ti.com/educationportal/sites/US/sectionHome/support.html
6. This content is available online at <http://cnx.org/content/m16298/1.14/>.
number of data values in the sample. (In the chapter on Sampling and Data (Section 1.1), we defined
• f = frequency
• n = total number of data values (or the sum of the individual frequencies), and
• RF = relative frequency,
then:
$$RF = \frac{f}{n} \tag{2.1}$$
f = 3, n = 40, and $RF = \frac{f}{n} = \frac{3}{40} = 0.075$
measures.
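The relative frequency formula (2.1) can be checked with a few lines of Python. This is a minimal sketch; the function name `relative_frequency` is an illustrative choice, not something defined in this book.

```python
def relative_frequency(f, n):
    """Return RF = f / n, the share of the n data values counted by f."""
    if n <= 0:
        raise ValueError("n must be a positive count of data values")
    return f / n


# Worked values from the text: f = 3 and n = 40.
rf = relative_frequency(3, 40)
print(rf)  # 0.075
```

Multiplying the result by 100 expresses the same value as a percentage.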
To construct a histogram, first decide how many bars or intervals, also called classes, represent the data.
Many histograms consist of from 5 to 15 bars or classes for clarity. Choose a starting point for the first
interval to be less than the smallest data value. A convenient starting point is a lower value carried out
to one more decimal place than the value with the most decimal places. For example, if the value with the
most decimal places is 6.1 and this is the smallest value, a convenient starting point is 6.05 (6.1 - 0.05 = 6.05).
We say that 6.05 has more precision. If the value with the most decimal places is 2.23 and the lowest value
is 1.5, a convenient starting point is 1.495 (1.5 - 0.005 = 1.495). If the value with the most decimal places is
3.234 and the lowest value is 1.0, a convenient starting point is 0.9995 (1.0 - 0.0005 = 0.9995). If all the data
happen to be integers and the smallest value is 2, then a convenient starting point is 1.5 (2 - 0.5 = 1.5). Also,
will fall on a boundary.
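The rule for a convenient starting point can be sketched in Python. This is only an illustration: the function name `convenient_start` is hypothetical, and the data are assumed to arrive as decimal strings so that the number of decimal places as written (for example "1.0") is preserved.

```python
from decimal import Decimal


def convenient_start(values):
    """Subtract 5 in the next decimal place from the smallest value.

    `values` is an iterable of decimal strings such as "6.1" (an assumption
    of this sketch, so that trailing decimal places are not lost).
    """
    decimals = [Decimal(v) for v in values]
    # Most decimal places found in any value (0 if all are integers).
    places = max(max(-d.as_tuple().exponent, 0) for d in decimals)
    # 0.5 for integers, 0.05 for one decimal place, 0.005 for two, ...
    offset = Decimal(5).scaleb(-(places + 1))
    return min(decimals) - offset


print(convenient_start(["6.1", "7.3"]))    # 6.05
print(convenient_start(["2.23", "1.5"]))   # 1.495
print(convenient_start(["3.234", "1.0"]))  # 0.9995
print(convenient_start(["2", "5"]))        # 1.5
```

The `decimal` module is used instead of floats so that results such as 1.495 are exact rather than binary approximations.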
Example 2.6
The following data are the heights (in inches to the nearest half inch) of 100 male semiprofessional
soccer players. The heights are continuous data since height is measured.
60;60.5;61;61;61.5
63.5;63.5;63.5
64;64;64;64;64;64;64;64.5;64.5;64.5;64.5;64.5;64.5;64.5;64.5
66;66;66;66;66;66;66;66;66;66;66.5;66.5;66.5;66.5;66.5;66.5;66.5;66.5;66.5;66.5;66.5;67;67;
67;67;67;67;67;67;67;67;67;67;67.5;67.5;67.5;67.5;67.5;67.5;67.5
68;68;69;69;69;69;69;69;69;69;69;69;69.5;69.5;69.5;69.5;69.5
70;70;70;70;70;70;70.5;70.5;70.5;71;71;71
72;72;72;72.5;72.5;73;73.5
74
The smallest data value is 60. Since the data with the most decimal places has one decimal (for
instance, 61.5), we want our starting point to have two decimal places. Since the numbers 0.5,
0.05, 0.005, etc. are convenient numbers, use 0.05 and subtract it from 60, the smallest value, for
the convenient starting point.
60 - 0.05 = 59.95, which is more precise than, say, 61.5 by one decimal place. The starting point is,
then, 59.95.
The largest value is 74. 74 + 0.05 = 74.05 is the ending value.
Next, calculate the width of each bar or class interval. To calculate this width, subtract the starting
point from the ending value and divide by the number of bars (you must choose the number of
bars you desire). Suppose you choose 8 bars.
$$\frac{74.05 - 59.95}{8} = 1.76 \tag{2.2}$$
NOTE: We will round up to 2 and make each bar or class interval 2 units wide. Rounding up to 2 is
one way to prevent a value from falling on a boundary. Rounding to the next number is necessary
even if it goes against the standard rules of rounding. For this example, using 1.76 as the width
would also work.
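The width calculation and the class boundaries it leads to can be sketched in Python. The variable names are illustrative, and `math.ceil` is used here as one way to "round up to the next number"; it would leave a width that is already a whole number unchanged.

```python
import math
from decimal import Decimal

start = Decimal("59.95")  # convenient starting point
end = Decimal("74.05")    # ending value
bars = 8                  # number of bars chosen for this example

raw_width = (end - start) / bars  # 1.7625, shown rounded as 1.76 in (2.2)
width = math.ceil(raw_width)      # rounded up to 2

# One more boundary than there are bars.
boundaries = [start + i * width for i in range(bars + 1)]
print([str(b) for b in boundaries])
```

Because the boundaries carry two decimal places and the heights carry at most one, no height can equal a boundary.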
The boundaries are:
• 59.95
• 59.95 + 2 = 61.95
• 61.95 + 2 = 63.95
• 63.95 + 2 = 65.95
• 65.95 + 2 = 67.95
• 67.95 + 2 = 69.95
• 69.95 + 2 = 71.95
• 71.95 + 2 = 73.95
• 73.95 + 2 = 75.95
The heights 60 through 61.5 inches are in the interval 59.95 - 61.95. The heights that are 63.5 are
in the interval 61.95 - 63.95. The heights that are 64 through 64.5 are in the interval 63.95 - 65.95.
The heights 66 through 67.5 are in the interval 65.95 - 67.95. The heights 68 through 69.5 are in the
interval 67.95 - 69.95. The heights 70 through 71 are in the interval 69.95 - 71.95. The heights 72
through 73.5 are in the interval 71.95 - 73.95. The height 74 is in the interval 73.95 - 75.95.
The following histogram displays the heights on the x-axis and relative frequency on the y-axis.
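The bar heights of such a histogram can be tabulated with a short Python sketch. The `counts` dictionary is a condensed transcription of the 100 heights listed in this example (height: number of times it appears); the names and the use of `bisect` are choices made for this illustration.

```python
from bisect import bisect_right
from collections import Counter

# Height (inches) -> number of players with that height, from the data above.
counts = {
    60: 1, 60.5: 1, 61: 2, 61.5: 1, 63.5: 3, 64: 7, 64.5: 8,
    66: 10, 66.5: 11, 67: 12, 67.5: 7, 68: 2, 69: 10, 69.5: 5,
    70: 6, 70.5: 3, 71: 3, 72: 3, 72.5: 2, 73: 1, 73.5: 1, 74: 1,
}
heights = [h for h, c in counts.items() for _ in range(c)]
n = len(heights)

# Class boundaries: 59.95, 61.95, ..., 75.95 (width 2, 8 intervals).
boundaries = [59.95 + 2 * i for i in range(9)]

# Index of the interval each height falls into (0 for 59.95 - 61.95, ...).
freq = Counter(bisect_right(boundaries, h) - 1 for h in heights)

for i in range(8):
    rf = freq[i] / n
    print(f"{boundaries[i]:.2f} - {boundaries[i + 1]:.2f}: "
          f"frequency {freq[i]}, relative frequency {rf:.2f}")
```

Each relative frequency is the height of the corresponding bar, and the eight values sum to 1.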
Example 2.7
The following data are the number of books bought by 50 part-time college students at ABC
College. The number of books is discrete data since books are counted.
1;1;1;1;1;1;1;1;1;1;1
2;2;2;2;2;2;2;2;2;2
3;3;3;3;3;3;3;3;3;3;3;3;3;3;3;3
4;4;4;4;4;4
5;5;5;5;5
6;6
largest data value. Then the starting point is 0.5 and the ending value is 6.5.
Problem
(Solution on p. 114.)
Next, calculate the width of each bar or class interval. If the data are discrete and there are not too
many different values, a width that places the data values in the middle of the bar or class interval
is the most convenient. Since the data consist of the numbers 1, 2, 3, 4, 5, 6 and the starting point is
0.5, a width of one places the 1 in the middle of the interval from 0.5 to 1.5, the 2 in the middle of
the interval from 1.5 to 2.5, the 3 in the middle of the interval from 2.5 to 3.5, the 4 in the middle of
the interval from _______ to _______, the 5 in the middle of the interval from _______ to _______,
and the _______ in the middle of the interval from _______ to _______.
Calculate the number of bars as follows:
$$\frac{6.5 - 0.5}{\text{bars}} = 1 \tag{2.3}$$
where 1 is the width of a bar. Therefore, bars = 6.
The following histogram displays the number of books on the x-axis and the frequency on the
y-axis.
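For discrete data like these, the number of bars and the frequency of each value can be sketched in Python. The list `books` rebuilds the 50 data values from this example; the variable names are illustrative.

```python
from collections import Counter

# The 50 data values: eleven 1s, ten 2s, sixteen 3s, six 4s, five 5s, two 6s.
books = [1] * 11 + [2] * 10 + [3] * 16 + [4] * 6 + [5] * 5 + [6] * 2

start, end, width = 0.5, 6.5, 1

# Solve (end - start) / bars = width for the number of bars.
bars = round((end - start) / width)
print(bars)  # 6

# Frequency of each value, i.e. the height of each bar.
freq = Counter(books)
print([freq[value] for value in range(1, 7)])  # [11, 10, 16, 6, 5, 2]
```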
Using the TI-83, 83+, 84, 84+ Calculator Instructions
data and for creating a customized histogram. Create the histogram for Example 2.7.
• Press Y=. Press CLEAR to clear out any equations.
• Press STAT 1:EDIT. If L1 has data in it, arrow up into the name L1, press CLEAR and arrow down. If
necessary, do the same for L2.
• Into L1, enter 1, 2, 3, 4, 5, 6
• Into L2, enter 11, 10, 16, 6, 5, 2
• Press WINDOW. Make Xmin = .5, Xmax = 6.5, Xscl = (6.5 - .5)/6, Ymin = -1, Ymax = 20, Yscl = 1, Xres = 1
• Press 2nd Y=. Start by pressing 4:Plotsoff ENTER.
• Press 2nd Y=. Press 1:Plot1. Press ENTER. Arrow down to TYPE. Arrow to the 3rd picture (histogram).
Press ENTER.
• Arrow down to Xlist: Enter L1 (2nd 1). Arrow down to Freq. Enter L2 (2nd 2).
• Press GRAPH
• Use the TRACE key and the arrow keys to examine the histogram.
2.4.1 Optional Collaborative Exercise
Count the money (bills and change) in your pocket or purse. Your instructor will record the amounts. As a
class, construct a histogram displaying the data. Discuss how many intervals you think are appropriate. You
may want to experiment with the number of intervals. Discuss, also, the shape of the histogram.
Record the data, in dollars (for example, 1.25 dollars).
Construct a histogram.
2.5 Box Plots
7
Box plots or box-whisker plots give a good graphical image of the concentration of the data. They also
show how far from most of the data the extreme values are. The box plot is constructed from five values:
the smallest value, the first quartile, the median, the third quartile, and the largest value. The median, the
first quartile, and the third quartile will be discussed here, and then again in the section on measuring data
in this chapter. We use these values to compare how close other data values are to them.
The median, a number, is a way of measuring the "center" of the data. You can think of the median as the
"middle value," although it does not actually have to be one of the observed values. It is a number that
1; 11.5; 6; 7.2; 4; 8; 9; 10; 6.8; 8.3; 2; 2; 10; 1
Ordered from smallest to largest:
1; 1; 2; 2; 4; 6; 6.8; 7.2; 8; 8.3; 9; 10; 10; 11.5
together and divide by 2.
$$\frac{6.8 + 7.2}{2} = 7 \tag{2.4}$$
The median is 7. Half of the values are smaller than 7 and half of the values are larger than 7.
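The median calculation for this data set can be sketched in Python: sort the values, then take the middle value, or the average of the two middle values when the count is even. The function below is written out by hand for illustration; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of values, average the two middle values.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2


data = [1, 11.5, 6, 7.2, 4, 8, 9, 10, 6.8, 8.3, 2, 2, 10, 1]
m = median(data)
print(round(m, 2))  # 7.0
```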
Quartiles are numbers that separate the data into quarters. Quartiles may or may not be part of the data.
To find the quartiles, first find the median or second quartile. The first quartile is the middle value of the
lower half of the data and the third quartile is the middle value of the upper half of the data. To get the
idea, consider the same data set shown above:
1; 1; 2; 2; 4; 6; 6.8; 7.2; 8; 8.3; 9; 10; 10; 11.5
The median or second quartile is 7. The lower half of the data is 1, 1, 2, 2, 4, 6, 6.8. The middle value of the
lower half is 2.
1; 1; 2; 2; 4; 6; 6.8
The number 2, which is part of the data, is the first quartile. One-fourth of the values are the same or less
than 2 and three-fourths of the values are more than 2.
The upper half of the data is 7.2, 8, 8.3, 9, 10, 10, 11.5. The middle value of the upper half is 9.
7.2; 8; 8.3; 9; 10; 10; 11.5
The number 9, which is part of the data, is the third quartile. Three-fourths of the values are less than 9
and one-fourth of the values are more than 9.
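The "middle value of each half" procedure can be sketched in Python. The function name `quartiles` is illustrative. One assumption is labeled in the code: when the number of values is odd, this sketch leaves the median out of both halves, a case the text above does not address.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median of each half of the data.

    Assumption of this sketch: for an odd number of values, the median
    itself is excluded from both the lower and the upper half.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data values")
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return median(lower), median(data), median(upper)


data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]
q1, q2, q3 = quartiles(data)
print(q1, q3)  # 2 9
```

Other quartile conventions exist (for example, interpolation methods in statistical software) and can give slightly different values for the same data.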
To construct a box plot, use a horizontal number line and a rectangular box. The smallest and largest data
values label the endpoints of the axis. The first quartile marks one end of the box and the third quartile
marks the other end of the box. The middle fifty percent of the data fall inside the box. The "whiskers"
extend from the ends of the box to the smallest and largest data values. The box plot gives a good quick
picture of the data.
7. This content is available online at <http://cnx.org/content/m16296/1.13/>.
NOTE: You may encounter box and whisker plots that have dots marking outlier values. In those
cases, the whiskers are not extending to the minimum and maximum values.
Consider the following data:
1;1;2;2;4;6;6.8;7.2;8;8.3;9;10;10;11.5
The first quartile is 2, the median is 7, and the third quartile is 9. The smallest value is 1 and the largest
value is 11.5. The box plot is constructed as follows (see calculator instructions in the back of this book or
on the TI website, footnote 8):
The two whiskers extend from the first quartile to the smallest value and from the third quartile to the
Example 2.8
The following data are the heights of 40 students in a statistics class.
59;60;61;62;62;63;63;64;64;64;65;65;65;65;65;65;65;65;65;66;66;67;67;68;68;69;70;70;70;
70;70;71;71;72;72;73;74;74;75;77
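As a cross-check on the calculator values reported in this example, the five values a box plot needs can be sketched in Python. The function name `five_number_summary` is illustrative, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is assumed here to match the calculator's method.

```python
from statistics import median


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a box plot."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data values")
    half = len(data) // 2
    lower = data[:half]               # lower half of the sorted data
    upper = data[len(data) - half:]   # upper half of the sorted data
    return data[0], median(lower), median(data), median(upper), data[-1]


heights = [
    59, 60, 61, 62, 62, 63, 63, 64, 64, 64, 65, 65, 65, 65, 65, 65, 65, 65,
    65, 66, 66, 67, 67, 68, 68, 69, 70, 70, 70, 70, 70, 71, 71, 72, 72, 73,
    74, 74, 75, 77,
]
summary = five_number_summary(heights)
print(summary)  # (59, 64.5, 66.0, 70.0, 77)
```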
Construct a box plot:
Using the TI-83, 83+, 84, 84+ Calculator
• Enter data into the list editor (Press STAT 1:EDIT). If you need to clear the list, arrow up to
the name L1, press CLEAR, arrow down.
• Put the data values in list L1.
• Press STAT and arrow to CALC. Press 1:1-VarStats. Enter L1.
• Press ENTER
• Use the down and up arrow keys to scroll.
• Smallest value = 59
• Largest value = 77
• Q1: First quartile = 64.5
• Q2: Second quartile or median = 66
• Q3: Third quartile = 70
Using the TI-83, 83+, 84, 84+ to Construct the Box Plot
Go to 14:Appendix for Notes for the TI-83, 83+, 84, 84+ Calculator. To create the box plot:
• Press Y=. If there are any equations, press CLEAR to clear them.
• Press 2nd Y=.
• Press 4:Plotsoff. Press ENTER
8. http://education.ti.com/educationportal/sites/US/sectionHome/support.html
