CHAPTER THREE
Describing Data Using
Numerical Measures
Although graphs and charts provide effective tools for transforming data into information, they do not reveal all the information contained in a data set. So, to make your description complete, you need to become familiar with the key descriptive measures that quantify the center of the data and its spread.
3-1 Measures of Center and Location
Objectives
To compute the mean, the weighted average, the median and the mode for a set of data and understand what these values represent.
To compute the percentiles and quartiles for a set of data and understand what these values represent.
To construct a box plot and interpret it.
Recall that a parameter does not change because it is based on all the population values, whereas a statistic varies from sample to sample because it is based only on a random sample of the population.
The mean (or the average) of a quantitative variable is computed by dividing the sum of the values by the number of values.
The population mean is given by the formula $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$, where N is the population size and $x_i$ is the ith individual value of the variable x.
For any set of data, the sum of the deviations around the mean will be zero.
The sample mean is given by the formula $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where n is the sample size and $x_i$ is the ith individual value of the variable x.
Note that sample descriptors (statistics) are usually assigned Roman letters, whereas parameters are usually assigned Greek ones.
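The mean, and the property that the deviations around it sum to zero, can be checked numerically. The following is a minimal sketch; the function name `mean` and the data values are made up for illustration.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Hypothetical sample data, chosen only for illustration.
data = [4, 8, 15, 16, 23, 42]
x_bar = mean(data)

# Deviations of each value from the mean; their sum should be zero
# (up to floating-point rounding).
deviations = [x - x_bar for x in data]
print("mean:", x_bar)
print("sum of deviations:", sum(deviations))
```

The same function serves for a population or a sample; only the interpretation (parameter or statistic) differs.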
The median divides the sorted data array into two halves because it lies exactly in the middle. We use $\tilde{\mu}$ to denote the population median and Md to denote the sample median.
The median is found by following these steps:
Sort the data increasingly.
If n is odd then the median is the value in the middle, $x_{\left(\frac{n+1}{2}\right)}$, that is, the data observation that has rank or position $\frac{n+1}{2}$.
If n is even then the median is the mean of the two values in the middle, $\frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2}$.
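The steps above can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `median` is an assumption, and positions are converted from the 1-based ranks used in the text to Python's 0-based indices.

```python
def median(values):
    """Median by the sort-then-pick-the-middle procedure."""
    ordered = sorted(values)          # step 1: sort the data in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one value")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the value at 1-based position (n + 1) / 2, i.e. index mid
        return ordered[mid]
    # n even: mean of the values at 1-based positions n/2 and n/2 + 1
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))      # odd number of values
print(median([7, 1, 5, 3]))   # even number of values
```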
A data set is said to be symmetric if its observations are evenly spread around the center, in which case the mean and the median are equal. If it is not symmetric then it is called skewed.
The mode is the value in the data set that occurs most frequently. The mode need not be unique: a multimodal data set has more than one mode.
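Because the mode may not be unique, a small helper that returns every most-frequent value can be useful. The sketch below uses the standard-library `collections.Counter`; the function name `modes` is an assumption made for illustration.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the highest frequency, sorted."""
    if not values:
        raise ValueError("modes requires at least one value")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 3, 3, 5, 7]))      # one mode
print(modes([2, 3, 3, 5, 5, 7]))   # two modes (a multimodal data set)
```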
In many cases the observations in a data set are equally weighted, but in some applications there is a reason to weight the data values differently. In that case we compute the weighted mean, which assigns each observation a weight according to its importance or frequency of occurrence.
The weighted mean is given by the formula $\mu_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}$ for the population and $\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$ for the sample, where $w_i$ is the weight of the ith data value.
A familiar example of a weighted mean is your GPA, in which each course grade is weighted by the course's credit hours.
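As a sketch of the GPA example, the weighted mean can be computed as the sum of weight times value divided by the sum of the weights. The grade points and credit hours below are hypothetical, and the function name `weighted_mean` is an assumption.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical transcript: grade points earned and credit hours per course.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 1]

gpa = weighted_mean(grade_points, credit_hours)
print("GPA:", gpa)
```

With equal weights the function reduces to the ordinary mean.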
Recall that before enrolling at the university you took the SAT and received a percentile score in math and verbal skills. The pth percentile in a data array is a value that divides the data set into two parts: the lower segment contains at least p% of the data and the upper segment contains the rest.
The pth percentile is denoted by $P_p$. In particular, the 50th percentile is the median of the data set.
The following procedure is used to calculate the pth percentile for a quantitative set of data:
Sort the data increasingly.
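Determine the percentile location value $i = \frac{p}{100}(n+1)$, where p is the desired percentile and n is the number of data values.
If i is an integer then the pth percentile is the value at position i. If i is not an integer then interpolate between the value at the integer portion of i and the next value: $P_p = x_{(i)} + d\,(x_{(i+1)} - x_{(i)})$, where $x_{(i)}$ is the value at the integer portion of i and d is the decimal part of i.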
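The percentile procedure can be sketched as follows, assuming the location value is i = (p/100)(n + 1) and that the interpolation is linear between the two neighbouring sorted values. The function name `percentile` is hypothetical. Clamping to the smallest or largest value when i falls outside the range 1..n is an assumption of this sketch, not part of the procedure above. Other software may use different percentile conventions and give slightly different answers.

```python
def percentile(values, p):
    """pth percentile using the location value i = p * (n + 1) / 100."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)              # step 1: sort in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile requires at least one value")
    i = p * (n + 1) / 100                 # step 2: location value (1-based)
    k = int(i)                            # integer portion of i
    d = i - k                             # decimal part of i
    if k < 1:                             # assumption: clamp below the first value
        return ordered[0]
    if k >= n:                            # assumption: clamp above the last value
        return ordered[-1]
    # step 3: interpolate between positions k and k + 1 (d == 0 gives position k)
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])


data = [10, 20, 30, 40, 50, 60, 70]       # hypothetical data
print(percentile(data, 50))               # the median
print(percentile(data, 40))               # requires interpolation
```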
The quartiles are special cases of the percentiles. They are the values that divide the data set into four equal-sized groups. They are denoted by Q1, Q2 and Q3, the first, second and third quartiles respectively, and correspond to the 25th, 50th and 75th percentiles.
The box-and-whisker plot incorporates the median and the quartiles to display quantitative data graphically. It is also used to identify outliers, the extremely small or large values in the data set.
Use the following steps to construct a box-and-whisker plot:
Sort the data increasingly.
Find the three quartiles.
Draw a box so that the ends of the box are at Q1 and Q3.
Draw a vertical line through the box at the median.
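Calculate the interquartile range IQR = Q3 - Q1.
Compute the lower limit LL = Q1 - 1.5 × IQR and the upper limit UL = Q3 + 1.5 × IQR.
Extend dashed lines (the whiskers) from each end of the box to the lowest and highest values that fall within the limits.
Any value outside the lower or upper limit (an outlier) is marked with an asterisk.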
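The numerical part of the construction (quartiles, IQR, limits, whisker ends and outliers) can be sketched without any plotting. The helper names are hypothetical. The quartiles are taken here as the 25th, 50th and 75th percentiles with location value i = p(n + 1)/100, which is an assumption about the convention. The data are made up.

```python
def percentile(values, p):
    """pth percentile with location value i = p * (n + 1) / 100 (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    i = p * (n + 1) / 100
    k, d = int(i), i - int(i)
    if k < 1:
        return ordered[0]
    if k >= n:
        return ordered[-1]
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])


def box_plot_summary(values):
    """Quantities needed to draw a box-and-whisker plot."""
    q1, q2, q3 = (percentile(values, p) for p in (25, 50, 75))
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    inside = [x for x in values if lower_limit <= x <= upper_limit]
    outliers = sorted(x for x in values if x < lower_limit or x > upper_limit)
    return {
        "Q1": q1, "median": q2, "Q3": q3, "IQR": iqr,
        "LL": lower_limit, "UL": upper_limit,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": outliers,
    }


summary = box_plot_summary([10, 20, 30, 40, 50, 60, 200])  # hypothetical data
for name, value in summary.items():
    print(name, value)
```

In this made-up data set the value 200 lies above the upper limit, so it would be drawn as an asterisk rather than joined to the box by a whisker.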
Refer to the comparison table given in the textbook on page 92.
3-2 Measures of Variation
Objectives
To compute the range, the variance, and the standard deviation for a set of data and understand what these values represent.
A set of data exhibits variation (dispersion) if the data values are not all the same.
The range is R = maximum value - minimum value.
The interquartile range is IQR = Q3 - Q1.
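The population variance is the average of the squared distances (deviations) of the data values from the mean. It is given by the formula $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N} = \frac{\sum_{i=1}^{N} x_i^2 - N\mu^2}{N}$.
The population standard deviation ($\sigma$) is the positive square root of the population variance.
The sample variance is computed from the squared distances (deviations) of the data values from the sample mean, with the sum divided by n - 1 rather than n. It is given by the formula $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$.
The sample standard deviation (s) is the positive square root of the sample variance.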
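The difference between the population and sample versions is only the divisor (N versus n - 1). The sketch below illustrates this with hypothetical function names and made-up data.

```python
import math


def population_variance(values):
    """Average squared deviation from the mean (divide by N)."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the sample mean divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]           # hypothetical data
print("range:", max(data) - min(data))
print("population variance:", population_variance(data))
print("population std dev: ", math.sqrt(population_variance(data)))
print("sample variance:    ", sample_variance(data))
print("sample std dev:     ", math.sqrt(sample_variance(data)))
```

The standard library's `statistics.pvariance` and `statistics.variance` compute the same two quantities and can serve as a cross-check.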
3-3 Using the Mean & the Standard Deviation Together
Objectives
To compute the coefficient of variation and z scores and understand how they are applied in decision-making.
To introduce the empirical rule and Tchebycheff's theorem.
The coefficient of variation is used to measure the relative variation in data. It is mainly used to compare the variation of two data sets measured in different units.
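The coefficient of variation is the standard deviation expressed as a percentage of the mean, CV = (s / x̄) × 100% for a sample (and σ / μ × 100% for a population). The sketch below compares two made-up samples measured in different units. It uses the standard-library `statistics.mean` and `statistics.stdev`, and the function name `coefficient_of_variation` is an assumption.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample coefficient of variation, in percent: (s / x_bar) * 100."""
    x_bar = statistics.mean(sample)
    if x_bar == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return statistics.stdev(sample) / x_bar * 100


# Hypothetical samples in different units.
heights_cm = [160, 170, 180]
weights_kg = [50, 60, 70]

print("CV of heights (%):", coefficient_of_variation(heights_cm))
print("CV of weights (%):", coefficient_of_variation(weights_kg))
```

Both samples have the same standard deviation (10 units), yet the weights vary more relative to their mean, which is what the coefficient of variation is designed to reveal.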
Note that the empirical rule is used when the data are approximately bell-shaped (symmetric), whereas Tchebycheff's theorem applies in general to any set of data regardless of its distribution.
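For reference, the two results can be put side by side. Under the usual statement of the empirical rule, approximately 68%, 95% and 99.7% of bell-shaped data lie within 1, 2 and 3 standard deviations of the mean. Tchebycheff's theorem guarantees at least 1 - 1/k² of any data set within k standard deviations, for k > 1. The sketch below is illustrative only, and the names in it are hypothetical.

```python
# Approximate proportions under the empirical rule (bell-shaped data only).
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}


def tchebycheff_lower_bound(k):
    """Minimum proportion of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3):
    print(
        f"within {k} standard deviations: "
        f"at least {tchebycheff_lower_bound(k):.3f} (any data), "
        f"about {EMPIRICAL_RULE[k]} (bell-shaped data)"
    )
```

The Tchebycheff bound is weaker than the empirical rule because it has to hold for every distribution.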
Another useful tool for comparing data sets of different measuring units is the z-score, which measures the number of standard deviations a value is from the mean.
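The z-score is computed as z = (x - mean) / standard deviation. The following sketch uses made-up exam figures to show how z-scores allow values from differently scaled data sets to be compared; the function name `z_score` is an assumption.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / std_dev


# Hypothetical scores on two exams with different scales.
z_exam_a = z_score(85, mean=70, std_dev=10)     # exam marked out of 100
z_exam_b = z_score(620, mean=500, std_dev=100)  # exam on a different scale

print("z-score on exam A:", z_exam_a)
print("z-score on exam B:", z_exam_b)
```

A positive z-score indicates a value above the mean, and a negative one a value below it. In this made-up case the exam A result is the stronger one in relative terms, because it lies more standard deviations above its mean.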