﻿
541
12.8Prediction
8
Recallthethirdexam/ﬁnalexamexample.
Weexaminedthescatterplotandshowedthatthecorrelationcoefﬁcientissigniﬁcant.Wefoundtheequa-
usetheleastsquaresregressionlineforprediction.
thirdexam. Theexamscores(x-values)rangefrom65to75. Since73isbetweenthex-values65and75,
substitutex=73intotheequation.Then:
^
y
173.51+4.83(73)=179.08
(12.8)
theﬁnalexam,onaverage.
Example12.11
Recallthethirdexam/ﬁnalexamexample.
Problem1
Whatwouldyoupredicttheﬁnalexamscoretobeforastudentwhoscoreda66onthethird
exam?
Solution
145.27
Problem2
(Solutiononp.579.)
Whatwouldyoupredicttheﬁnalexamscoretobeforastudentwhoscoreda90onthethird
exam?
**WithcontributionsfromRobertaBloom
12.9Outliers
9
Insomedatasets, there arevalues(observeddatapoints)called outliersOutliersareobserveddata
pointsthatarefarfromtheleastsquaresline.Theyhavelarge"errors",wherethe"error"orresidualisthe
verticaldistancefromthelinetothepoint.
Outliersneedtobeexaminedclosely.Sometimes,forsomereasonoranother,theyshouldnotbeincluded
intheanalysisofthedata.Itispossiblethatanoutlierisaresultoferroneousdata.Othertimes,anoutlier
Besidesoutliers,asamplemaycontainoneorafewpointsthatarecalled inﬂuentialpoints. Inﬂuential
pointsareobserveddatapointsthatarefarfromtheotherobserveddatapointsinthehorizontaldirection.
Thesepointsmayhaveabigeffectontheslopeoftheregressionline. Tobegintoidentifyaninﬂuential
point,youcanremoveitfromthedatasetandseeiftheslopeoftheregressionlineischangedsigniﬁcantly.
8
Thiscontentisavailableonlineat<http://cnx.org/content/m17095/1.8/>.
9Thiscontentisavailableonlineat<http://cnx.org/content/m17094/1.14/>.
AvailableforfreeatConnexions<http://cnx.org/content/col10522/1.40>
Pdf to powerpoint converter online - C# Create PDF from PowerPoint Library to convert pptx, ppt to PDF in C#.net, ASP.NET MVC, WinForms, WPF
Online C# Tutorial for Creating PDF from Microsoft PowerPoint Presentation
converter pdf to powerpoint; convert pdf to ppt online without email
Pdf to powerpoint converter online - VB.NET Create PDF from PowerPoint Library to convert pptx, ppt to PDF in vb.net, ASP.NET MVC, WinForms, WPF
VB.NET Tutorial for Export PDF file from Microsoft Office PowerPoint
pdf to powerpoint slide; convert pdf to ppt
542
CHAPTER12. LINEARREGRESSIONANDCORRELATION
Computersand manycalculatorscanbe usedtoidentifyoutliersfromthe data. . Computeroutputfor
regressionanalysiswilloftenidentifybothoutliersandinﬂuentialpointssothatyoucanexaminethem.
IdentifyingOutliers
Wecouldguessatoutliersbylookingatagraphofthescatterplotandbestﬁtline.Howeverwewouldlike
someguidelineastohowfarawayapointneedstobeinordertobeconsideredanoutlier. Asaroughrule
ofthumb,wecanﬂaganypointthatislocatedfurtherthantwostandarddeviationsaboveorbelowthe
bestﬁtlineasanoutlier.Thestandarddeviationusedisthestandarddeviationoftheresidualsorerrors.
Wecandothisvisuallyinthescatterplotbydrawinganextrapairoflinesthataretwostandarddeviations
aboveandbelowthebestﬁtline. Anydatapointsthatareoutsidethisextrapairoflinesareﬂaggedas
potentialoutliers.Orwecandothisnumericallybycalculatingeachresidualandcomparingittotwicethe
standarddeviation. OntheTI-83,83+,or84+,thegraphicalapproachiseasier. Thegraphicalprocedure
isshownﬁrst,followedbythenumericalcalculations. Youwouldgenerallyonlyneedtouseoneofthese
methods.
Example12.12
Inthethirdexam/ﬁnalexamexample,youcandetermineifthereisanoutlierornot. Ifthereis
anoutlier,asanexercise,deleteitandﬁttheremainingdatatoanewline. Forthisexample,the
newlineoughttoﬁttheremainingdatabetter. ThismeanstheSSEshouldbesmallerandthe
correlationcoefﬁcientoughttobecloserto1or-1.
Solution
GraphicalIdentiﬁcationofOutliers
WiththeTI-83,83+,84+graphingcalculators,itiseasytoidentifytheoutliergraphicallyandvisu-
ally. Ifweweretomeasuretheverticaldistancefromanydatapointtothecorrespondingpoint
onthelineofbestﬁtandthatdistancewasequalto2sorfarther,thenwewouldconsiderthedata
pointtobe"toofar"fromthelineofbestﬁt. Weneedtoﬁndandgraphthelinesthataretwo
standarddeviationsbelowandabovetheregressionline. Anypointsthatareoutsidethesetwo
linesareoutliers.WewillcalltheselinesY2andY3:
Aswedidwiththe equationoftheregressionlineandthecorrelationcoefﬁcient,we willuse
technologyto calculatethisstandarddeviationforus. . UsingtheLinRegTTestwiththisdata,
scrolldownthroughtheoutputscreenstoﬁnds=16.412
LineY2=-173.5+4.83x-2(16.4)andlineY3=-173.5+4.83x+2(16.4)
where
^
y
=-173.5+4.83x is the line of best t ﬁt. . Y2 2 and d Y3 3 have the same e slope as s the line of
bestﬁt.
GraphthescatterplotwiththebestﬁtlineinequationY1,thenenterthetwoextralinesasY2and
Y3inthe"Y="equationeditorandpressZOOM9.Youwillﬁndthattheonlydatapointthatisnot
betweenlinesY2andY3isthepointx=65,y=175.Onthecalculatorscreenitisjustbarelyoutside
exam;thispointisfurtherthan2standarddeviationsawayfromthebestﬁtline.
Sometimesapointissoclosetothelinesusedtoﬂagoutliersonthegraphthatitisdifﬁculttotell
ifthepointisbetweenoroutsidethelines. Onacomputer,enlargingthegraphmayhelp;ona
smallcalculatorscreen,zoominginmaymakethegraphclearer. Notethatwhenthegraphdoes
notgiveaclearenoughpicture,youcanusethenumericalcomparisonstoidentifyoutliers.
AvailableforfreeatConnexions<http://cnx.org/content/col10522/1.40>
Online Convert PowerPoint to PDF file. Best free online export
Online Powerpoint to PDF Converter. Download Free Trial. Convert a PPTX/PPT File to PDF. Just upload your file by clicking on the blue
pdf to ppt converter online; convert pdf back to powerpoint
XDoc.Converter for .NET, Support Documents and Images Conversion
Convert PowerPoint to ODP. Convert to Tiff. Convert Word, Excel and PDF to image. Next Steps. Download and try Converter for .NET with online support. See Pricing
how to convert pdf to powerpoint on; convert pdf file to ppt online
543
Figure12.16
NumericalIdentiﬁcationofOutliers
Inthe table below, , the e ﬁrst two columns s are e the third examandﬁnalexamdata. . The e third
columnshowsthepredicted
^
y
valuescalculatedfromthelineofbestﬁt:
^
y
=-173.5+4.83x. The
residuals,orerrors, havebeencalculatedinthefourthcolumnofthetable: : observedyvalue
predictedyvalue=y
^
y
.
sisthestandarddeviationofallthey
^
y=
evalueswheren=thetotalnumberofdatapoints.If
deviationoftheresidualsiscalculatedfromtheSSEas:
s=
q
SSE
2
Ratherthancalculatethevalueofsourselves,wecanﬁndsusingthecomputerorcalculator.For
thisexample,thecalculatorfunctionLinRegTTestfound16.4asthestandarddeviationofthe
residuals35;-17;16;-6;-19;9;3;-1;-10;-9;-1.
AvailableforfreeatConnexions<http://cnx.org/content/col10522/1.40>
C#: How to Use SDK to Convert Document and Image Using XDoc.
You may use our converter SDK to easily convert PDF, Word, Excel, PowerPoint, Tiff, and Dicom files to raster images like Jpeg, Png, Bmp and Gif.
convert pdf slides to powerpoint online; how to convert pdf slides to powerpoint presentation
RasterEdge XDoc.PowerPoint for .NET - SDK for PowerPoint Document
how to convert pdf to powerpoint; pdf to powerpoint converter
544
CHAPTER12. LINEARREGRESSIONANDCORRELATION
x
y
^
y
y
^
y
65
175
140
175 140=35
67
133
150
133 15017
71
185
169
185 169=16
71
163
169
163 1696
66
126
145
126 14519
75
198
189
198 189=9
67
153
150
153 150=3
70
163
164
163 1641
71
159
169
159 16910
69
151
160
151 1609
69
159
160
159 1601
Table12.1
Wearelookingforalldatapointsforwhichtheresidualisgreaterthan2s=2(16.4)=32.8orlessthan
-32.8.Comparethesevaluestotheresidualsincolumn4ofthetable.Theonlysuchdatapointis
thisstudentis35.
Howdoestheoutlieraffectthebestﬁtline?
Numericallyandgraphically,wehaveidentiﬁedthepoint(65,175)asanoutlier. Weshouldre-
examinethedataforthispointtoseeifthereareanyproblemswiththedata. Ifthereisanerror
weshouldﬁxtheerrorifpossible,ordeletethedata. Ifthedataiscorrect,wewouldleaveitin
thedataset. Forthisproblem,wewillsupposethatweexaminedthedataandfoundthatthis
outlierdatawasanerror. Thereforewewillcontinueonanddeletetheoutlier,sothatwecan
explorehowitaffectstheresults,asalearningexperience.
Computeanewbest-ﬁtlineandcorrelationcoefﬁcientusingthe10remainingpoints:
OntheTI-83,TI-83+,TI-84+calculators,deletetheoutlierfromL1andL2.UsingtheLinRegTTest,
thenewlineofbestﬁtandthecorrelationcoefﬁcientare:
^
y=
355.19+7.39xandr=0.9121
Thenewlinewith0.9121isastrongercorrelationthantheoriginal(r=0.6631)because=
0.9121iscloserto1. Thismeansthatthenewlineisabetterﬁttothe10remainingdatavalues.
Thelinecanbetterpredicttheﬁnalexamscoregiventhethirdexamscore.
NumericalIdentiﬁcationofOutliers:CalculatingsandFindingOutliersManually
IfyoudonothavethefunctionLinRegTTest, thenyoucancalculatethe outlierintheﬁrst exampleby
doingthefollowing.
First,squareeachjy
^
y
j(SeetheTABLEabove):
AvailableforfreeatConnexions<http://cnx.org/content/col10522/1.40>
VB.NET PDF - Convert PDF Online with VB.NET HTML5 PDF Viewer
All Formats. XDoc.HTML5 Viewer. XDoc.Windows Viewer. XDoc.Converter. View & Process. Adobe PDF. XDoc.PDF. Scanning. XDoc.Word. XDoc.Excel. XDoc.PowerPoint. Barcoding
convert pdf to ppt; conversion of pdf into ppt
VB.NET PDF- View PDF Online with VB.NET HTML5 PDF Viewer
All Formats. XDoc.HTML5 Viewer. XDoc.Windows Viewer. XDoc.Converter. View & Process. Adobe PDF. XDoc.PDF. Scanning. XDoc.Word. XDoc.Excel. XDoc.PowerPoint. Barcoding
convert pdf file to powerpoint; how to convert pdf to powerpoint slides
545
Thesquaresare35
2
;17
2
;16
2
;6
2
;19
2
;9
2
;3
2
;1
2
;10
2
;9
2
;1
2
^
y
jsquaredtermsusingtheformula
11
S
i=1
jy
i
^
y
i
j
!
2
=
11
S
i=1
e
i
2
(Recallthaty
i
^
y
i
=e
i
.)
=35
2
+17
2
+16
2
+6
2
+19
2
+9
2
+3
2
+1
2
+10
2
+9
2
+1
2
=2440=SSE.Theresult,SSEistheSumofSquaredErrors.
Next,calculates,thestandarddeviationofallthey
^
y
evalueswheren=thetotalnumberofdata
points.
Thecalculationiss=
q
SSE
2
Forthethirdexam/ﬁnalexamproblem,s=
q
2440
11 2
=16.47
Next,multiplysby1.9:
(1.9)(16.47)=31.29
31.29isalmost2standarddeviationsawayfromthemeanofthey
^
y
values.
Ifweweretomeasuretheverticaldistancefromanydatapointtothecorrespondingpointonthelineof
bestﬁtandthatdistanceisatleast1.9s,thenwewouldconsiderthedatapointtobe"toofar"fromtheline
ofbestﬁt.Wecallthatpointapotentialoutlier.
Fortheexample, ifanyofthe jy
^
y
jvaluesareatleast31.29,thecorresponding(x,y)datapointisa
potentialoutlier.
Forthethirdexam/ﬁnalexamproblem,allthejy
^
y
j’sarelessthan31.29exceptfortheﬁrstonewhichis
35.
35>31.29
Thatis,jy
^
y
j(1.9)(s)
Thepointwhichcorrespondstojy
^
y
j=35is(65,175).Therefore,thedatapoint(65,175)isapotential
outlier.Forthisexample,wewilldeleteit.(Remember,wedonotalwaysdeleteanoutlier.)
The next step is s to o compute a a new best-ﬁt t line e usingthe e 10remaining g points. . The e new line ofbest
ﬁtandthecorrelationcoefﬁcientare:
^
y
355.19+7.39xandr=0.9121
Example12.13
Usingthisnewlineofbestﬁt(basedontheremaining10datapoints),whatwouldastudent
AvailableforfreeatConnexions<http://cnx.org/content/col10522/1.40>
C# HTML5 PDF Viewer SDK to convert and export PDF document to
All Formats. XDoc.HTML5 Viewer. XDoc.Windows Viewer. XDoc.Converter. View & Process. Adobe PDF. XDoc.PDF. Scanning. XDoc.Word. XDoc.Excel. XDoc.PowerPoint. Barcoding
converting pdf to powerpoint slides; images from pdf to powerpoint
DocImage SDK for .NET: Web Document Image Viewer Online Demo
Try Online Demo Now. Please click Browse to upload a file to display in web viewer. Suppported files are Word, Excel, PowerPoint, PDF, Tiff, Dicom and main
change pdf to powerpoint on; convert pdf to editable ppt
546
CHAPTER12. LINEARREGRESSIONANDCORRELATION
Solution
Usingthenewlineofbestﬁt,
^
y
355.19+7.39(73) 184.28.Astudentwhoscored73points
onthethirdexamwouldexpecttoearn184pointsontheﬁnalexam.
The original l line e predicted
^
y
173.51+4.83(73) =
179.08 so o the prediction n using the
newlinewiththeoutliereliminateddiffersfromtheoriginalprediction.
Example12.14
(FromTheConsumerPriceIndexesWebsite)TheConsumerPriceIndex(CPI)measurestheaver-
agechangeovertimeinthepricespaidbyurbanconsumersforconsumergoodsandservices.The
CPIaffectsnearlyallAmericansbecauseofthemanywaysitisused.Oneofitsbiggestusesisas
Congress,andtheFederalReserveBoardusetheCPI’strendstoformulatemonetaryandﬁscal
policies.Inthefollowingtable,xistheyearandyistheCPI.
Data:
x
y
1915
10.1
1926
17.7
1935
13.7
1940
14.7
1947
24.1
1952
26.5
1964
31.0
1969
36.7
1975
49.3
1979
72.6
1980
82.4
1986
109.6
1991
130.7
1999
166.6
Table12.2
Problem
 Makeascatterplotofthedata.
 Calculatetheleastsquaresline.Writetheequationintheform
^
y
=a+bx.
 Drawthelineonthescatterplot.
 Findthecorrelationcoefﬁcient.Isitsigniﬁcant?
 WhatistheaverageCPIfortheyear1990?
AvailableforfreeatConnexions<http://cnx.org/content/col10522/1.40>