1    ************************************************************************;
2    *** EXST7034 Example 1 using PC-SAS - Airline vial breakage          ***;
3    *** Problem from Neter, Kutner, Nachtsheim & Wasserman 1996, #1.21   ***;
4    ************************************************************************;
5    OPTIONS LS=88 PS=256 NOCENTER NODATE NONUMBER;
6    DATA ONE; INFILE CARDS MISSOVER;
7       TITLE1 'EXST7034 - Example 1 : Airline vial breakage - NKNW Example 1.21';
8     * LABEL X = 'Breakage per 1000 vials';
9     * LABEL Y = 'Number of airline transfers';
10      INPUT X Y;
11      ANOTHERX = X;
12   CARDS;

NOTE: The data set WORK.ONE has 11 observations and 3 variables.
NOTE: DATA statement used:
      real time           0.06 seconds
      cpu time            0.06 seconds

24   ;
25   PROC SORT DATA=ONE; BY X Y; RUN;

NOTE: There were 11 observations read from the data set WORK.ONE.
NOTE: The data set WORK.ONE has 11 observations and 3 variables.
NOTE: PROCEDURE SORT used:
      real time           0.04 seconds
      cpu time            0.04 seconds

26   PROC PRINT DATA=ONE; TITLE2 'Raw Data Listing'; RUN;

NOTE: There were 11 observations read from the data set WORK.ONE.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used:
      real time           0.04 seconds
      cpu time            0.04 seconds

EXST7034 - Example 1 : Airline vial breakage - NKNW Example 1.21
Raw Data Listing

Obs    X     Y    ANOTHERX
  1    0     8        0
  2    0     9        0
  3    0    11        0
  4    0    12        0
  5    1    13        1
  6    1    15        1
  7    1    16        1
  8    2    17        2
  9    2    19        2
 10    3    22        3
 11    4     .        4

27   PROC REG DATA=ONE lineprinter; TITLE2 'Regression Models done with SAS REG procedure';
28      MODEL Y = X / XPX I P CLM CLI R CLB alpha=0.01;
29      TEST X = 5;
30      OUTPUT OUT=RESIDS PREDICTED=P RESIDUAL=E; RUN;

NOTE: 11 observations read.
NOTE: 1 observations have missing values.
NOTE: 10 observations used in computations.

31   OPTIONS PS=35 ls=80; PLOT Y*X='O' PREDICTED.*X='P'/ OVERLAY;
32      PLOT RESIDUAL.*X='e'; RUN;

NOTE: The data set WORK.RESIDS has 11 observations and 5 variables.
NOTE: The PROCEDURE REG printed pages 2-7.
NOTE: PROCEDURE REG used:
      real time           0.11 seconds
      cpu time            0.09 seconds

EXST7034 - Example 1 : Airline vial breakage - NKNW Example 1.21
Regression Models done with SAS REG procedure

The REG Procedure
Model: MODEL1

Model Crossproducts X'X X'Y Y'Y

Variable       Intercept           X           Y
Intercept             10          10         142
X                     10          20         182
Y                    142         182        2194

Dependent Variable: Y

X'X Inverse, Parameter Estimates, and SSE

Variable       Intercept           X           Y
Intercept            0.2        -0.1        10.2
X                   -0.1         0.1           4
Y                   10.2           4        17.6

Analysis of Variance

                              Sum of          Mean
Source             DF        Squares        Square    F Value    Pr > F
Model               1      160.00000     160.00000      72.73    <.0001
Error               8       17.60000       2.20000
Corrected Total     9      177.60000

Root MSE            1.48324    R-Square    0.9009
Dependent Mean     14.20000    Adj R-Sq    0.8885
Coeff Var          10.44535

Parameter Estimates

                  Parameter     Standard
Variable     DF    Estimate        Error    t Value    Pr > |t|    99% Confidence Limits
Intercept     1    10.20000      0.66332      15.38      <.0001     7.97429     12.42571
X             1     4.00000      0.46904       8.53      <.0001     2.42618      5.57382

Output Statistics

        Dep Var    Predicted       Std Error
Obs           Y        Value    Mean Predict        99% CL Mean          99% CL Predict
  1      8.0000      10.2000          0.6633     7.9743    12.4257      4.7481    15.6519
  2      9.0000      10.2000          0.6633     7.9743    12.4257      4.7481    15.6519
  3     11.0000      10.2000          0.6633     7.9743    12.4257      4.7481    15.6519
  4     12.0000      10.2000          0.6633     7.9743    12.4257      4.7481    15.6519
  5     13.0000      14.2000          0.4690    12.6262    15.7738      8.9802    19.4198
  6     15.0000      14.2000          0.4690    12.6262    15.7738      8.9802    19.4198
  7     16.0000      14.2000          0.4690    12.6262    15.7738      8.9802    19.4198
  8     17.0000      18.2000          0.6633    15.9743    20.4257     12.7481    23.6519
  9     19.0000      18.2000          0.6633    15.9743    20.4257     12.7481    23.6519
 10     22.0000      22.2000          1.0488    18.6808    25.7192     16.1046    28.2954
 11           .      26.2000          1.4832    21.2232    31.1768     19.1617    33.2383

Output Statistics

                    Std Error     Student                        Cook's
Obs     Residual     Residual    Residual       -2-1 0 1 2            D
  1      -2.2000        1.327      -1.658    |   ***|      |      0.344
  2      -1.2000        1.327      -0.905    |     *|      |      0.102
  3       0.8000        1.327       0.603    |      |*     |      0.045
  4       1.8000        1.327       1.357    |      |**    |      0.230
  5      -1.2000        1.407      -0.853    |     *|      |      0.040
  6       0.8000        1.407       0.569    |      |*     |      0.018
  7       1.8000        1.407       1.279    |      |**    |      0.091
  8      -1.2000        1.327      -0.905    |     *|      |      0.102
  9       0.8000        1.327       0.603    |      |*     |      0.045
 10      -0.2000        1.049      -0.191    |      |      |      0.018
 11            .            .           .                             .

Sum of Residuals                          0
Sum of Squared Residuals           17.60000
Predicted Residual SS (PRESS)      25.85290

28      MODEL Y = X / XPX I P CLM CLI R CLB alpha=0.01;
29      TEST X = 5;

The REG Procedure
Model: MODEL1

Test 1 Results for Dependent Variable Y

                            Mean
Source           DF       Square    F Value    Pr > F
Numerator         1     10.00000       4.55    0.0656
Denominator       8      2.20000

31   OPTIONS PS=35 ls=80; PLOT Y*X='O' PREDICTED.*X='P'/ OVERLAY;
32      PLOT RESIDUAL.*X='e'; RUN;

The REG Procedure
Model: MODEL1
Dependent Variable: Y

[Line-printer plot: Y*X (symbol 'O') overlaid with PREDICTED.*X (symbol 'P'). Vertical axis Y, 5 to 25; horizontal axis X, 0.00 to 3.00.]

Dependent Variable: Y

[Line-printer plot: RESIDUAL.*X (symbol 'e'). Vertical axis RESIDUAL, -3 to 2; horizontal axis X, 0.00 to 3.00.]

33   PROC PLOT DATA=RESIDS; PLOT E*X='x' / VREF=0; RUN;
34   OPTIONS PS=256 ls=88;

NOTE: There were 11 observations read from the data set WORK.RESIDS.
NOTE: The PROCEDURE PLOT printed page 8.
NOTE: PROCEDURE PLOT used:
      real time           0.01 seconds
      cpu time            0.01 seconds

Plot of E*X.  Symbol used is 'x'.

[Line-printer plot: E*X (symbol 'x') with a reference line at E = 0. Vertical axis Residual, -2.2 to 1.8; horizontal axis X, 0 to 4.]

36   PROC UNIVARIATE DATA=RESIDS PLOT NORMAL; VAR Y E; RUN;

NOTE: The PROCEDURE UNIVARIATE printed pages 9-10.
NOTE: PROCEDURE UNIVARIATE used:
      real time           0.02 seconds
      cpu time            0.02 seconds

NOTE: 1 obs had missing values.

The UNIVARIATE Procedure
Variable: Y

Moments

N                          10    Sum Weights                10
Mean                     14.2    Sum Observations          142
Std Deviation      4.44222167    Variance           19.7333333
Skewness           0.30002336    Kurtosis           -0.6155187
Uncorrected SS           2194    Corrected SS            177.6
Coeff Variation    31.2832512    Std Error Mean     1.40475383

Basic Statistical Measures

    Location                    Variability
Mean      14.20000    Std Deviation            4.44222
Median    14.00000    Variance                19.73333
Mode        .         Range                   14.00000
                      Interquartile Range      6.00000

Tests for Location: Mu0=0

Test           -Statistic-      -----p Value------
Student's t    t    10.10853    Pr > |t|    <.0001
Sign           M           5    Pr >= |M|   0.0020
Signed Rank    S        27.5    Pr >= |S|   0.0020

Tests for Normality

Test                  --Statistic---      -----p Value------
Shapiro-Wilk          W       0.977983    Pr < W      0.9535
Kolmogorov-Smirnov    D       0.106472    Pr > D     >0.1500
Cramer-von Mises      W-Sq    0.016073    Pr > W-Sq  >0.2500
Anderson-Darling      A-Sq    0.127378    Pr > A-Sq  >0.2500

Quantiles (Definition 5)

Quantile      Estimate
100% Max          22.0
99%               22.0
95%               22.0
90%               20.5
75% Q3            17.0
50% Median        14.0
25% Q1            11.0
10%                8.5
5%                 8.0
1%                 8.0
0% Min             8.0

Extreme Observations

----Lowest----        ----Highest---
Value      Obs        Value      Obs
    8        1           15        6
    9        2           16        7
   11        3           17        8
   12        4           19        9
   13        5           22       10

Missing Values

                      -----Percent Of-----
Missing                            Missing
  Value     Count     All Obs          Obs
      .         1        9.09       100.00

Stem Leaf     #  Boxplot
  22 0        1     |
  20                |
  18 0        1     |
  16 00       2  +-----+
  14 0        1  *--+--*
  12 00       2  |     |
  10 0        1  +-----+
   8 00       2     |
     ----+----+----+----+

[Normal probability plot of Y. Vertical axis 9 to 23; horizontal axis -2 to +2.]

The UNIVARIATE Procedure
Variable: E (Residual)

Moments

N                          10    Sum Weights                10
Mean                        0    Sum Observations            0
Std Deviation       1.3984118    Variance           1.95555556
Skewness           -0.1340807    Kurtosis           -1.3788555
Uncorrected SS           17.6    Corrected SS             17.6
Coeff Variation             .    Std Error Mean     0.44221664

Basic Statistical Measures

    Location                    Variability
Mean      0.000000    Std Deviation            1.39841
Median    0.300000    Variance                 1.95556
Mode       .          Range                    4.00000
                      Interquartile Range      2.00000

Tests for Location: Mu0=0

Test           -Statistic-      -----p Value------
Student's t    t           0    Pr > |t|    1.0000
Sign           M           0    Pr >= |M|   1.0000
Signed Rank    S        -1.5    Pr >= |S|   0.9219

Tests for Normality

Test                  --Statistic---      -----p Value------
Shapiro-Wilk          W       0.907336    Pr < W      0.2632
Kolmogorov-Smirnov    D       0.216365    Pr > D     >0.1500
Cramer-von Mises      W-Sq    0.075596    Pr > W-Sq   0.2184
Anderson-Darling      A-Sq    0.449452    Pr > A-Sq   0.2244

Quantiles (Definition 5)

Quantile      Estimate
100% Max           1.8
99%                1.8
95%                1.8
90%                1.8
75% Q3             0.8
50% Median         0.3
25% Q1            -1.2
10%               -1.7
5%                -2.2
1%                -2.2
0% Min            -2.2

Extreme Observations

----Lowest----        ----Highest---
Value      Obs        Value      Obs
 -2.2        1          0.8        9
 -1.2        8          0.8        6
 -1.2        5          0.8        3
 -1.2        2          1.8        7
 -0.2       10          1.8        4

Missing Values

                      -----Percent Of-----
Missing                            Missing
  Value     Count     All Obs          Obs
      .         1        9.09       100.00

Stem Leaf     #  Boxplot
   1 88       2     |
   1                |
   0 888      3  +-----+
   0             *--+--*
  -0 2        1  |     |
  -0             |     |
  -1 222      3  +-----+
  -1                |
  -2 2        1     |
     ----+----+----+----+

[Normal probability plot of E. Vertical axis -2.25 to 1.75; horizontal axis -2 to +2.]

52   GOPTIONS DEVICE=cgm GSFMODE=REPLACE GSFNAME=OUT1 NOPROMPT noROTATE;
53   FILENAME OUT1
53 !  'C:\Geaghan\EXST\EXST7034New\Fall2002\SAS\ArilineVials-Nknw1-21.cgm';

NOTE: The PROCEDURE REG printed page 18.
NOTE: PROCEDURE REG used:
      real time           0.17 seconds
      cpu time            0.11 seconds

54   PROC GPLOT DATA=one; TITLE1 F=SWISS H=1 'Vial breakage';
55      PLOT y*x=1 y*x=2 y*x=3 / HAXIS=AXIS1 VAXIS=AXIS2 overlay;
56   AXIS1 LABEL=(F=SWISS H=1 'Transfers') WIDTH=5 MINOR=(N=4)
57         VALUE=(F=SWISS H=1) ORDER=0 TO 4 BY 1;
58   AXIS2 LABEL=(F=SWISS H=1 'Breakage (number per 1000)') WIDTH=6
59         VALUE=(F=SWISS H=1) MINOR=(N=5) ORDER=0 TO 30 BY 5;
60      SYMBOL1 C=red V=J I=none W=1 H=1 F=SPECIAL MODE=INCLUDE;
61      SYMBOL2 C=blue V=none I=RLcli95 L=1 W=2 H=1 MODE=INCLUDE;
62      SYMBOL3 C=orange V=none I=RLclm95 L=1 W=3 H=1 MODE=INCLUDE;
63   RUN;

WARNING: The axis frame outline was drawn with line width 6 as specified on the left
         vertical axis. Any other axis line widths were ignored.
NOTE: 1 observation(s) contained a MISSING value for the Y * X request.
NOTE: Regression equation : Y = 10.2 + 4*X.
NOTE: 1 observation(s) contained a MISSING value for the Y * X request.
NOTE: Regression equation : Y = 10.2 + 4*X.
NOTE: 1 observation(s) contained a MISSING value for the Y * X request.
NOTE: 172 RECORDS WRITTEN TO
      C:\Geaghan\EXST\EXST7034New\Fall2002\SAS\ArilineVials-Nknw1-21.cgm

64   quit;

NOTE: There were 11 observations read from the data set WORK.ONE.
NOTE: PROCEDURE GPLOT used:
      real time           0.33 seconds
      cpu time            0.13 seconds
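
The log does not echo the data lines that follow the CARDS statement (log lines 13-23). The sketch below is one way to rerun the core of the analysis, assuming the eleven rows shown in the Raw Data Listing are the complete input. They are entered here in the sorted order of that listing, since the original input order is not shown. The DATA _NULL_ step is an added hand check, not part of the original program: it recomputes the TEST X = 5 statistic from three values printed in the REG output (slope estimate 4, mean square error 2.2, and the X diagonal element 0.1 of the X'X inverse).

```sas
/* Sketch only: data values are copied from the Raw Data Listing      */
/* (sorted order); the original order of the data lines is unknown.   */
DATA ONE;
   INFILE CARDS MISSOVER;
   INPUT X Y;
   ANOTHERX = X;
CARDS;
0 8
0 9
0 11
0 12
1 13
1 15
1 16
2 17
2 19
3 22
4 .
;
RUN;

/* Same MODEL and TEST statements as log lines 28-29.                 */
PROC REG DATA=ONE;
   MODEL Y = X / XPX I P CLM CLI R CLB ALPHA=0.01;
   TEST X = 5;
RUN;
QUIT;

/* Hand check of TEST X = 5 (not in the original program):            */
/* F = (b1 - 5)**2 / (MSE * c11), with 1 and 8 degrees of freedom.    */
DATA _NULL_;
   B1  = 4;      /* slope estimate from Parameter Estimates           */
   MSE = 2.2;    /* Error mean square from Analysis of Variance       */
   C11 = 0.1;    /* X diagonal element of the X'X inverse             */
   F = (B1 - 5)**2 / (MSE * C11);
   P = 1 - PROBF(F, 1, 8);
   PUT F= 8.4 P= 8.4;
RUN;
```

The hand-computed F should agree with the F Value of 4.55 and Pr > F of 0.0656 reported in the Test 1 Results table, up to rounding.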