/* This file shows how the fitted model of Y regressed on X1 represents the
   explained part of the relationship (the correlation) between the two
   variables. The residuals represent the unexplained part of the
   relationship. The unexplained part is "pure" in the statistical sense
   that the linear relationship between Y and X1 has been removed. So, if
   the residuals of Y|X1 are regressed on X2, we estimate b-hat2. The match
   with the multiple-regression coefficient is exact only when X1 and X2 are
   uncorrelated; otherwise the slope is shrunk by the factor
   1 - r(X1,X2)^2. */

capture clear
input int income byte educ byte age
 6281  4 21
10516  4 37
 6898  6 19
 8212  6 23
11744  6 31
 8618  8 18
10011  8 23
12405  8 30
14664  8 39
 7472 10 18
11598 10 27
15336 10 36
10186 11 20
 9771 12 18
12444 12 19
14213 12 25
16908 12 37
18347 12 40
19546 13 45
12660 14 18
16326 14 27
12772 15 19
17218 15 30
12599 16 17
14852 16 19
19138 16 33
21779 16 50
16428 17 24
20018 17 35
16526 18 20
19414 18 28
18822 20 25
end

regress income educ age          /* Estimate multiple regression model */

tempvar YX1resid YX2resid        /* Set variables for residuals */

regress income educ              /* Estimate bivariate model Y|X1 */
predict `YX1resid', residuals    /* Calculate residuals */
regress `YX1resid' age           /* Estimate YX1resid|X2 */

regress income age               /* Estimate bivariate model Y|X2 */
predict `YX2resid', residuals    /* Calculate residuals */
regress `YX2resid' educ          /* Estimate YX2resid|X1 */

corr income educ age             /* Pairwise correlations among Y, X1, X2 */

#delimit ;
twoway (lfitci educ age, stdp clc(maroon) clw(thick))
       (scatter educ age, ms(O) mfc(emidblue) mlc(black)),
       legend(off)
       ytitle("Education")
       xtitle("Age")
       title("Scatterplot of Education and Age")
       subtitle("Multiple Regression Example")
       xlabel(20(10)50) xtick(20(5)50) xmtick(17(1)50)
       ytick(4(1)20)
       name(g1, replace) nodraw ;
#delimit cr

#delimit ;
twoway (lfitci income educ, stdp clc(maroon) clw(thick))
       (scatter income educ, ms(O) mfc(emidblue) mlc(black)),
       legend(off)
       ytitle("Income")
       xtitle("Education")
       title("Scatterplot of Income and Education")
       subtitle("Multiple Regression Example")
       name(g2, replace) nodraw ;
#delimit cr

#delimit ;
twoway (lfitci income age, stdp clc(maroon) clw(thick))
       (scatter income age, ms(O) mfc(emidblue) mlc(black)),
       legend(off)
       ytitle("Income")
       xtitle("Age")
       title("Scatterplot of Income and Age")
       subtitle("Multiple Regression Example")
       xlabel(20(10)50) xtick(20(5)50) xmtick(17(1)50)
       ytick(5000(2500)25000)
       name(g3, replace) nodraw ;

graph combine g1 g2 g3, nocopies name(combined, replace) ;
#delimit cr
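
/* Supplementary sketch, not part of the original example: one way to
   recover b-hat2 exactly is to remove X1 from BOTH Y and X2 before
   regressing residuals on residuals (the Frisch-Waugh-Lovell result).
   The tempvar names Yres and X2res are illustrative assumptions. It
   assumes income, educ, and age are still in memory. The slope on
   `X2res' should equal the age coefficient from
   "regress income educ age"; the reported standard error will differ
   slightly because this regression does not count educ in its degrees
   of freedom. */

tempvar Yres X2res

quietly regress income educ          /* Remove X1 from Y */
predict double `Yres', residuals

quietly regress age educ             /* Remove X1 from X2 */
predict double `X2res', residuals

regress `Yres' `X2res'               /* Slope should match b-hat2 */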