Descriptive Statistics: Numerical Methods

Chapter 3
Descriptive Statistics: Numerical Methods
Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.
McGraw-Hill/Irwin

Descriptive Statistics
3.1 Describing Central Tendency
3.2 Measures of Variation
3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
3.4 Covariance, Correlation, and the Least Squares Line (Optional)
3.5 Weighted Means and Grouped Data (Optional)
3.6 The Geometric Mean (Optional)

3.1 Describing Central Tendency
LO3-1: Compute and interpret the mean, median, and mode.
In addition to describing the shape of a distribution, we want to describe the data set's central tendency.
A measure of central tendency represents the center or middle of the data.
The population mean (μ) is the average of the population measurements.
Population parameter: a number calculated from all the population measurements that describes some aspect of the population.
Sample statistic: a number calculated using the sample measurements that describes some aspect of the sample.

Measures of Central Tendency
Mean, μ: the average or expected value.
Median, Md: the value of the middle point of the ordered measurements.
Mode, Mo: the most frequent value.

The Mean
For a population of size N, the population mean is the average of all N measurements:
$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

The Sample Mean
For a sample of size n, the sample mean ($\bar{x}$) is defined as
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
and is a point estimate of the population mean μ.
It is the value to expect, on average and in the long run.

Example 3.1 Car Mileage Case: Estimating Mileage
Sample mean for the first five car mileages from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1
$\bar{x} = \frac{30.8 + 31.7 + 30.1 + 31.6 + 32.1}{5} = \frac{156.3}{5} = 31.26$
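As a quick check of the arithmetic in Example 3.1, the sample mean can be computed in a few lines of Python. This is only a minimal sketch; the variable names `mileages` and `x_bar` are illustrative and do not come from the slides.

```python
from statistics import mean

# First five car mileages from Table 3.1, as listed in Example 3.1
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]

# Sample mean: sum of the measurements divided by the sample size n
x_bar = mean(mileages)

print(f"n = {len(mileages)}, sample mean = {x_bar:.2f}")
```

The result agrees with the sample mean of 31.26 quoted for these five mileages.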

The Median
The median Md is a value such that 50% of all measurements, after having been arranged in numerical order, lie above (or below) it.
If the number of measurements is odd, the median is the middlemost measurement in the ordering.
If the number of measurements is even, the median is the average of the two middlemost measurements in the ordering.

Example 3.1 The Car Mileage Case
First five observations from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1
In order: 30.1, 30.8, 31.6, 31.7, 32.1
There is an odd number of measurements, so the median is the middle one, 31.6.

The Mode
The mode Mo of a population or sample of measurements is the measurement that occurs most frequently.
Modes are the values that are observed "most typically".
Sometimes higher frequencies occur at two or more values.
  If there are two modes, the data are bimodal.
  If there are more than two modes, the data are multimodal.
When data are in classes, the class with the highest frequency is the modal class.
  The modal class is the tallest box in the histogram.

Relationships Among Mean, Median and Mode
Figure 3.3

3.2 Measures of Variation
LO3-2: Compute and interpret the range, variance, and standard deviation.
Knowing the measures of central tendency is not enough.
Both of the distributions in Figure 3.13 have identical measures of central tendency.

Measures of Variation
Range: the largest minus the smallest measurement.
Variance: the average of the squared deviations of all the population measurements from the population mean.
Standard deviation: the square root of the population variance.

The Range
Largest minus smallest.
Measures the interval spanned by all the data.
For the left side of Figure 3.13, the largest is 5 and the smallest is 3.
The range is 5 - 3 = 2 days.

Population Variance and Standard Deviation
The population variance (σ²) is the average of the squared deviations of the individual population measurements from the population mean (μ).
The population standard deviation (σ) is the positive square root of the population variance.

Variance
For a population of size N, the population variance σ² is:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
For a sample of size n, the sample variance s² is:
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Standard Deviation
Population standard deviation (σ): $\sigma = \sqrt{\sigma^2}$
Sample standard deviation (s): $s = \sqrt{s^2}$

Example: Chris's Class Sizes This Semester
Data points are: 60, 41, 15, 30, 34
Mean is 36 (180/5)
Variance is (treating the five classes as the whole population):
$\sigma^2 = \frac{(60-36)^2 + (41-36)^2 + (15-36)^2 + (30-36)^2 + (34-36)^2}{5} = \frac{1082}{5} = 216.4$
Standard deviation is:
$\sigma = \sqrt{216.4} \approx 14.71$
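The class-size example can be worked through in code. The sketch below assumes, as the population formulas suggest, that the five classes are the entire population, so the sum of squared deviations is divided by N rather than N - 1. The helper name `population_variance` is hypothetical, not something defined in the slides.

```python
import math

def population_variance(values):
    """Average squared deviation from the population mean (divides by N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

# Chris's class sizes this semester (assumed to be the full population)
class_sizes = [60, 41, 15, 30, 34]

mu = sum(class_sizes) / len(class_sizes)       # population mean
sigma_sq = population_variance(class_sizes)    # population variance
sigma = math.sqrt(sigma_sq)                    # population standard deviation

print(f"mean = {mu}, variance = {sigma_sq:.1f}, std dev = {sigma:.2f}")
```

The squared deviations are 576, 25, 441, 36, and 4, which sum to 1082; dividing by N = 5 gives the variance.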

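Example: Sample Variance and Standard Deviation
Example 3.6: data for the first five car mileages from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1
The sample mean is 31.26.
The variance and standard deviation are:
$s^2 = \frac{(30.8-31.26)^2 + (31.7-31.26)^2 + (30.1-31.26)^2 + (31.6-31.26)^2 + (32.1-31.26)^2}{5 - 1} = \frac{2.572}{4} = 0.643$
$s = \sqrt{0.643} \approx 0.8019$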
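For a sample, the divisor is n - 1 rather than n. A minimal sketch using Python's standard `statistics` module, whose `variance` and `stdev` functions use the n - 1 divisor, applied to the five mileages of Example 3.6 (the variable names are illustrative):

```python
from statistics import mean, stdev, variance

# First five car mileages from Table 3.1 (Example 3.6)
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]

x_bar = mean(mileages)      # sample mean
s_sq = variance(mileages)   # sample variance, divides by n - 1
s = stdev(mileages)         # sample standard deviation, sqrt of s_sq

print(f"mean = {x_bar:.2f}, s^2 = {s_sq:.3f}, s = {s:.4f}")
```

Using `pvariance` and `pstdev` from the same module instead would divide by n, which is the population version.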

The Empirical Rule for Normal Populations
LO3-3: Use the Empirical Rule and Chebyshev's Theorem to describe variation.
If a population has mean μ and standard deviation σ and is described by a normal curve, then:
68.26% of the population measurements lie within one standard deviation of the mean: [μ - σ, μ + σ]
95.44% lie within two standard deviations of the mean: [μ - 2σ, μ + 2σ]
99.73% lie within three standard deviations of the mean: [μ - 3σ, μ + 3σ]

Chebyshev's Theorem
Let μ and σ be a population's mean and standard deviation. Then, for any value k > 1:
At least 100(1 - 1/k²)% of the population measurements lie in the interval [μ - kσ, μ + kσ].
Only practical for a non-mound-shaped population distribution that is not very skewed.

z Scores
For any x in a population or sample, the associated z score is
$z = \frac{x - \text{mean}}{\text{standard deviation}}$
The z score is the number of standard deviations that x is from the mean.
A positive z score is for x above (greater than) the mean.
A negative z score is for x below (less than) the mean.

Coefficient of Variation
Measures the size of the standard deviation relative to the size of the mean:
$\text{coefficient of variation} = \frac{\text{standard deviation}}{\text{mean}} \times 100\%$
Used to:
  Compare the relative variabilities of values about the mean
  Compare the relative variability of populations or samples with different means and different standard deviations
  Measure risk

3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
LO3-4: Compute and interpret percentiles, quartiles, and box-and-whiskers displays.
For a set of measurements arranged in increasing order, the pth percentile is a value such that p percent of the measurements fall at or below the value and (100 - p) percent of the measurements fall at or above the value.
The first quartile Q1 is the 25th percentile.
The second quartile (median) is the 50th percentile.
The third quartile Q3 is the 75th percentile.
The interquartile range IQR is Q3 - Q1.

Calculating Percentiles
Arrange the measurements in increasing order.
Calculate the index i = (p/100)n, where p is the percentile to find and n is the number of measurements.
(a) If i is not an integer, round up: the next integer greater than i denotes the position of the pth percentile.
(b) If i is an integer, the pth percentile is the average of the measurements in positions i and i + 1.

Percentile Example
i = (10/100)12 = 1.2
  Not an integer, so round up to 2.
  The 10th percentile is in the second position, so it is 11,070.
i = (25/100)12 = 3
  An integer, so average the values in positions 3 and 4.
  The 25th percentile is (18,211 + 26,817)/2, or 22,514.

Five Number Summary
The smallest measurement
The first quartile, Q1
The median, Md
The third quartile, Q3
The largest measurement
Displayed visually using a box-and-whiskers plot.

Box-and-Whiskers Plots
The box plots the:
  First quartile, Q1
  Median, Md
  Third quartile, Q3
  Inner fences
  Outer fences
Inner fences are located 1.5 × IQR away from the quartiles:
  Q1 - (1.5 × IQR)
  Q3 + (1.5 × IQR)
Outer fences are located 3 × IQR away from the quartiles:
  Q1 - (3 × IQR)
  Q3 + (3 × IQR)

Box-and-Whiskers Plots Continued
The "whiskers" are dashed lines that plot the range of the data.
A dashed line is drawn from the box below Q1 down to the smallest measurement.
Another dashed line is drawn from the box above Q3 up to the largest measurement.
Figures 3.17 and 3.18

Outliers
Outliers are measurements that are very different from other measurements.
They are either much larger or much smaller than most of the other measurements.
Outliers lie beyond the limits of the box-and-whiskers plot:
  measurements less than the lower limit or greater than the upper limit.

3.4 Covariance, Correlation, and the Least Squares Line (Optional)
LO3-5: Compute and interpret covariance, correlation, and the least squares line (Optional).
When points on a scatter plot seem to fluctuate around a straight line, there is a linear relationship between x and y.
A measure of the strength of a linear relationship is the covariance $s_{xy}$.

Covariance
For a sample of n pairs $(x_i, y_i)$, the sample covariance is
$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

A positive covariance indicates a positive linear relationship between x and y.
  As x increases, y tends to increase.
A negative covariance indicates a negative linear relationship between x and y.
  As x increases, y tends to decrease.
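The sign behaviour described here can be seen on a small data set. The sketch below assumes the usual sample covariance (sum of cross-products of deviations divided by n - 1). The data values and the function name `sample_covariance` are made up for illustration and are not taken from the slides.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - x_bar)(y - y_bar), divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with n >= 2")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / (n - 1)

# Hypothetical data: y_up rises with x, y_down falls as x rises
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]
y_down = [10, 8, 6, 4, 2]

cov_up = sample_covariance(x, y_up)      # positive: y increases with x
cov_down = sample_covariance(x, y_down)  # negative: y decreases with x

print(cov_up, cov_down)
```

Rescaling either variable (for example, measuring y in different units) changes the size of these values but not their signs.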

Correlation Coefficient
The magnitude of the covariance does not indicate the strength of the relationship.
  The magnitude depends on the unit of measurement used for the data.
The correlation coefficient (r) is a measure of the strength of the relationship that does not depend on the magnitude of the data:
$r = \frac{s_{xy}}{s_x s_y}$
where $s_x$ and $s_y$ are the sample standard deviations of x and y.

Correlation Coefficient Continued
The sample correlation coefficient r is always between -1 and +1.
  Values near -1 show strong negative correlation.
  Values near 0 show little or no linear correlation.
  Values near +1 show strong positive correlation.
The sample correlation coefficient is the point estimate for the population correlation coefficient ρ.

Least Squares Line
If there is a linear relationship between x and y, we might wish to predict y on the basis of x.
This requires the equation of a line describing the linear relationship.
The line is calculated using the least squares method.
  Discussed in detail in a later chapter.
Need to find the slope ($b_1$) and y-intercept ($b_0$).

3.5 Weighted Means and Grouped Data (Optional)
LO3-6: Compute and interpret weighted means and the mean and standard deviation of grouped data (Optional).
Sometimes, some measurements are more important than others.
Assign numerical "weights" to the data.
  Weights measure the relative importance of each value.
Calculate the weighted mean as
$\frac{\sum w_i x_i}{\sum w_i}$
where $w_i$ is the weight assigned to the ith measurement $x_i$.

Descriptive Statistics for Grouped Data
Data already categorized into a frequency distribution or a histogram are called grouped data.
The mean and variance can be calculated even when the raw data are not available.
The calculations are slightly different for data from a sample and data from a population.

Descriptive Statistics for Grouped Data (Sample)
Sample mean for grouped data:
$\bar{x} = \frac{\sum f_i M_i}{\sum f_i} = \frac{\sum f_i M_i}{n}$
Sample variance for grouped data:
$s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1}$
$f_i$ is the frequency for class i
$M_i$ is the midpoint of class i
$n = \sum f_i$ = sample size

Descriptive Statistics for Grouped Data (Population)
Population mean for grouped data:
$\mu = \frac{\sum f_i M_i}{N}$
Population variance for grouped data:
$\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$
$f_i$ is the frequency for class i
$M_i$ is the midpoint of class i
$N = \sum f_i$ = population size

3.6 The Geometric Mean (Optional)
For rates of return of an investment, use the geometric mean to give the correct wealth at the end of the investment.
Suppose the rates of return (expressed as decimal fractions) are $R_1, R_2, \ldots, R_n$ for periods 1, 2, …, n.
The mean of all these returns is calculated as the
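The formula for the geometric mean of returns is not shown in the text above. The sketch below assumes the standard definition, $R_g = \sqrt[n]{(1+R_1)(1+R_2)\cdots(1+R_n)} - 1$, and uses made-up returns to illustrate why it gives the correct ending wealth when the arithmetic mean does not. The function name `geometric_mean_return` and the sample figures are hypothetical.

```python
import math

def geometric_mean_return(returns):
    """Geometric mean rate of return for per-period returns given as decimals."""
    growth = math.prod(1 + r for r in returns)   # total growth factor
    return growth ** (1 / len(returns)) - 1

# Hypothetical returns: +100% in period 1, then -50% in period 2
returns = [1.0, -0.5]
initial_wealth = 1000.0

arithmetic = sum(returns) / len(returns)    # 0.25, overstates performance
geometric = geometric_mean_return(returns)  # 0.0, wealth is unchanged

# Ending wealth compounded period by period
ending_wealth = initial_wealth
for r in returns:
    ending_wealth *= 1 + r

# Ending wealth implied by each kind of mean
implied_by_geometric = initial_wealth * (1 + geometric) ** len(returns)
implied_by_arithmetic = initial_wealth * (1 + arithmetic) ** len(returns)

print(ending_wealth, implied_by_geometric, implied_by_arithmetic)
```

Compounding at the geometric mean reproduces the actual ending wealth, while compounding at the arithmetic mean of 25% would suggest a gain that never occurred.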
