------------------------------------------------------------------------------------ name: log: /Users/denisardalves/Desktop/EAE 324 2017/DATA SET/chap06Wooldridge.txt log type: text opened on: 6 Jun 2017, 10:44:29 . . * open data . use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/BWGHT.DTA", clear . . histogram bwght, normal bin(20) (bin=20, start=23, width=12.4) . reg bwght cigs faminc Source | SS df MS Number of obs = 1,388 -------------+---------------------------------- F(2, 1385) = 21.27 Model | 17126.2088 2 8563.10442 Prob > F = 0.0000 Residual | 557485.511 1,385 402.516614 R-squared = 0.0298 -------------+---------------------------------- Adj R-squared = 0.0284 Total | 574611.72 1,387 414.283864 Root MSE = 20.063 ------------------------------------------------------------------------------ bwght | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- cigs | -.4634075 .0915768 -5.06 0.000 -.6430518 -.2837633 faminc | .0927647 .0291879 3.18 0.002 .0355075 .1500219 _cons | 116.9741 1.048984 111.51 0.000 114.9164 119.0319 ------------------------------------------------------------------------------ . ereturn list scalars: e(N) = 1388 e(df_m) = 2 e(df_r) = 1385 e(F) = 21.27391543831015 e(r2) = .0298048373275821 e(rmse) = 20.06281671085743 e(mss) = 17126.20883339178 e(rss) = 557485.5109072425 e(r2_a) = .0284038334825678 e(ll) = -6130.414104234555 e(ll_0) = -6151.413176423765 e(rank) = 3 macros: e(cmdline) : "regress bwght cigs faminc" e(title) : "Linear regression" e(marginsok) : "XB default" e(vce) : "ols" e(depvar) : "bwght" e(cmd) : "regress" e(properties) : "b V" e(predict) : "regres_p" e(model) : "ols" e(estat_cmd) : "regress_estat" matrices: e(b) : 1 x 3 e(V) : 3 x 3 functions: e(sample) . estimates store bwght . 
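The two regressions that follow rescale the dependent variable and then the cigarette variable. A quick arithmetic sketch of what rescaling should do to the coefficients printed above is given here. It assumes bwghtlbs = bwght/16 and packs = cigs/20 (the log never shows these variables being created, so both definitions are assumptions), and the scalar names b_cigs, b_faminc and b_cons are hypothetical holders for the estimates copied from the table.

```stata
* Coefficients copied from the bwght regression above (ounces, cigarettes per day)
scalar b_cigs   = -.4634075
scalar b_faminc =  .0927647
scalar b_cons   = 116.9741

* Assumption: bwghtlbs = bwght/16, so every coefficient is divided by 16
display "cigs   (lbs): " b_cigs/16
display "faminc (lbs): " b_faminc/16
display "_cons  (lbs): " b_cons/16

* Assumption: packs = cigs/20, so the slope on packs is 20 times the slope on cigs
display "packs  (oz):  " b_cigs*20
```

If the assumptions hold, the first three values should agree with the bwghtlbs regression and the last one with the coefficient on packs, while the t statistics and R-squared are unaffected by the change of units.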
reg bwghtlbs cigs faminc Source | SS df MS Number of obs = 1,388 -------------+---------------------------------- F(2, 1385) = 21.27 Model | 66.8992533 2 33.4496266 Prob > F = 0.0000 Residual | 2177.67778 1,385 1.57233052 R-squared = 0.0298 -------------+---------------------------------- Adj R-squared = 0.0284 Total | 2244.57703 1,387 1.61829634 Root MSE = 1.2539 ------------------------------------------------------------------------------ bwghtlbs | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- cigs | -.028963 .0057236 -5.06 0.000 -.0401907 -.0177352 faminc | .0057978 .0018242 3.18 0.002 .0022192 .0093764 _cons | 7.310883 .0655615 111.51 0.000 7.182273 7.439494 ------------------------------------------------------------------------------ . ereturn list scalars: e(N) = 1388 e(df_m) = 2 e(df_r) = 1385 e(F) = 21.27391543831015 e(r2) = .0298048373275821 e(rmse) = 1.25392604442859 e(mss) = 66.89925325543663 e(rss) = 2177.677776981416 e(r2_a) = .0284038334825678 e(ll) = -2282.060957765739 e(ll_0) = -2303.060029954949 e(rank) = 3 macros: e(cmdline) : "regress bwghtlbs cigs faminc" e(title) : "Linear regression" e(marginsok) : "XB default" e(vce) : "ols" e(depvar) : "bwghtlbs" e(cmd) : "regress" e(properties) : "b V" e(predict) : "regres_p" e(model) : "ols" e(estat_cmd) : "regress_estat" matrices: e(b) : 1 x 3 e(V) : 3 x 3 functions: e(sample) . estimates store bwghtlbs . reg bwght packs faminc Source | SS df MS Number of obs = 1,388 -------------+---------------------------------- F(2, 1385) = 21.27 Model | 17126.2089 2 8563.10445 Prob > F = 0.0000 Residual | 557485.511 1,385 402.516614 R-squared = 0.0298 -------------+---------------------------------- Adj R-squared = 0.0284 Total | 574611.72 1,387 414.283864 Root MSE = 20.063 ------------------------------------------------------------------------------ bwght | Coef. Std. Err. t P>|t| [95% Conf. 
Interval]
-------------+----------------------------------------------------------------
       packs |  -9.268151   1.831536    -5.06   0.000    -12.86104   -5.675266
      faminc |   .0927647   .0291879     3.18   0.002     .0355075    .1500219
       _cons |   116.9741   1.048984   111.51   0.000     114.9164    119.0319
------------------------------------------------------------------------------

. ereturn list

scalars:
                  e(N) =  1388
               e(df_m) =  2
               e(df_r) =  1385
                  e(F) =  21.27391551108767
                 e(r2) =  .0298048374265047
               e(rmse) =  20.06281670983461
                e(mss) =  17126.20889023389
                e(rss) =  557485.5108504004
               e(r2_a) =  .0284038335816332
                 e(ll) =  -6130.414104163794
               e(ll_0) =  -6151.413176423765
               e(rank) =  3

macros:
            e(cmdline) : "regress bwght packs faminc"
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "ols"
             e(depvar) : "bwght"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"

matrices:
                  e(b) :  1 x 3
                  e(V) :  3 x 3

functions:
             e(sample)

. estimates store bwght2

. * to build a nicely formatted regression table for papers
. esttab bwght bwghtlbs bwght2, b(%10.4f) se scalars(N r2 rmse rss F) mtitles keep(
> cigs packs faminc _cons) title("Table 6.1 Effects of Data Scaling")

Table 6.1 Effects of Data Scaling
------------------------------------------------------------
                      (1)             (2)             (3)
                    bwght        bwghtlbs          bwght2
------------------------------------------------------------
cigs              -0.4634***      -0.0290***
                 (0.0916)        (0.0057)

faminc             0.0928**        0.0058**        0.0928**
                 (0.0292)        (0.0018)        (0.0292)

packs                                             -9.2682***
                                                 (1.8315)

_cons            116.9741***       7.3109***     116.9741***
                 (1.0490)        (0.0656)        (1.0490)
------------------------------------------------------------
N                    1388            1388            1388
r2                 0.0298          0.0298          0.0298
rmse              20.0628          1.2539         20.0628
rss             5.575e+05       2177.6778       5.575e+05
F                 21.2739         21.2739         21.2739
------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. /* To save the table for use in a Word document, use the following command: */
.
. quietly esttab bwght bwghtlbs bwght2 using TabelaScaling.rtf, replace b(%10.4f) se
> scalars(N r2 rmse rss F) mtitles keep(cigs packs faminc _cons) title("Table 6.1
> Effects of Data Scaling")

.
. * Example: Effect of Pollution on House Prices
. * data
. use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/HPRICE2.DTA", clear

.
. sum price nox crime rooms dist stratio

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       price |        506    22511.51    9208.856       5000      50001
         nox |        506    5.549783    1.158395       3.85       8.71
       crime |        506    3.611536    8.590247       .006     88.976
       rooms |        506    6.284051    .7025938       3.56       8.78
        dist |        506    3.795751    2.106137       1.13      12.13
-------------+---------------------------------------------------------
     stratio |        506    18.45929     2.16582       12.6         22

. * hedonic regression
. * (rooms is typed twice in the command below; Stata drops the duplicate, so the
. * estimates are the same as with rooms listed once)
. reg price nox crime rooms dist rooms stratio
note: rooms omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =       506
-------------+----------------------------------   F(5, 500)       =    174.47
       Model |  2.7223e+10         5  5.4445e+09   Prob > F        =    0.0000
    Residual |  1.5603e+10       500  31205611.6   R-squared       =    0.6357
-------------+----------------------------------   Adj R-squared   =    0.6320
       Total |  4.2826e+10       505    84803032   Root MSE        =    5586.2

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         nox |  -2706.433   354.0869    -7.64   0.000    -3402.114   -2010.751
       crime |   -153.601   32.92883    -4.66   0.000    -218.2969   -88.90504
       rooms |   6735.498   393.6037    17.11   0.000     5962.177    7508.819
        dist |  -1026.806   188.1079    -5.46   0.000    -1396.386    -657.227
       rooms |          0  (omitted)
     stratio |  -1149.204   127.4287    -9.02   0.000    -1399.566   -898.8422
       _cons |   20871.13   5054.599     4.13   0.000     10940.26       30802
------------------------------------------------------------------------------

.
.
.
.
. reg price nox crime rooms dist stratio

      Source |       SS           df       MS      Number of obs   =       506
-------------+----------------------------------   F(5, 500)       =    174.47
       Model |  2.7223e+10         5  5.4445e+09   Prob > F        =    0.0000
    Residual |  1.5603e+10       500  31205611.6   R-squared       =    0.6357
-------------+----------------------------------   Adj R-squared   =    0.6320
       Total |  4.2826e+10       505    84803032   Root MSE        =    5586.2

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         nox |  -2706.433   354.0869    -7.64   0.000    -3402.114   -2010.751
       crime |   -153.601   32.92883    -4.66   0.000    -218.2969   -88.90504
       rooms |   6735.498   393.6037    17.11   0.000     5962.177    7508.819
        dist |  -1026.806   188.1079    -5.46   0.000    -1396.386    -657.227
     stratio |  -1149.204   127.4287    -9.02   0.000    -1399.566   -898.8422
       _cons |   20871.13   5054.599     4.13   0.000     10940.26       30802
------------------------------------------------------------------------------

. /* At what number of rooms, other things equal, does the effect of rooms on price
> change direction? To answer this, add the square of rooms to the model,
> differentiate log(price) with respect to rooms and set the derivative to zero.
> This gives the estimated turning point. In the estimates below the coefficient on
> rooms is negative and the coefficient on rooms2 is positive, so price falls with
> rooms below the turning point and rises above it, always under the ceteris
> paribus assumption. */
. gen rooms2=rooms*rooms

. gen ldist=log(dist)

. reg lprice lnox ldist rooms rooms2 stratio

      Source |       SS           df       MS      Number of obs   =       506
-------------+----------------------------------   F(5, 500)       =    151.77
       Model |  50.9872375         5  10.1974475   Prob > F        =    0.0000
    Residual |  33.5949875       500  .067189975   R-squared       =    0.6028
-------------+----------------------------------   Adj R-squared   =    0.5988
       Total |   84.582225       505  .167489554   Root MSE        =    .25921

------------------------------------------------------------------------------
      lprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnox |   -.901682   .1146869    -7.86   0.000     -1.12701   -.6763544
       ldist |  -.0867814   .0432807    -2.01   0.045    -.1718159    -.001747
       rooms |  -.5451128   .1654542    -3.29   0.001    -.8701839   -.2200417
      rooms2 |   .0622612    .012805     4.86   0.000      .037103    .0874194
     stratio |  -.0475902   .0058542    -8.13   0.000     -.059092   -.0360884
       _cons |   13.38548   .5664732    23.63   0.000     12.27252    14.49844
------------------------------------------------------------------------------

. * Number of rooms at which the effect on house prices changes direction
.
. display -1*_b[rooms]/(2*_b[rooms2])
4.3776293

.
. * Approximate % change in price when rooms goes from 5 to 6
.
. display 100*(_b[rooms]+2*_b[rooms2]*5)
7.7499072

.
. * Approximate % change in price when rooms goes from 6 to 7
.
. display 100*(_b[rooms]+2*_b[rooms2]*6)
20.202144

. * Beta Coefficients
. /* This example analyzes the impact of air pollution, captured by the variable
> nox, on house prices. It uses variables transformed into z-scores, so every
> variable is measured in standard deviations. The transformation is useful
> because it makes the estimated coefficients comparable: each one gives the
> effect on the dependent variable in the same unit, standard deviations. */
.
. sum price nox crime rooms dist stratio

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       price |        506    22511.51    9208.856       5000      50001
         nox |        506    5.549783    1.158395       3.85       8.71
       crime |        506    3.611536    8.590247       .006     88.976
       rooms |        506    6.284051    .7025938       3.56       8.78
        dist |        506    3.795751    2.106137       1.13      12.13
-------------+---------------------------------------------------------
     stratio |        506    18.45929     2.16582       12.6         22

. * hedonic regression
.
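The z-score transformation described above can also be sketched without retyping means and standard deviations by hand, which avoids transcription slips. The block below is only an illustration: it assumes HPRICE2.DTA is still in memory, and the z_ prefix is a hypothetical naming choice that does not appear in the original session.

```stata
* Assumes HPRICE2.DTA is in memory; z_* names are illustrative only
foreach v of varlist price nox crime rooms dist stratio {
    quietly summarize `v'
    * r(mean) and r(sd) are left behind by summarize
    generate z_`v' = (`v' - r(mean)) / r(sd)
}

* Each standardized variable should have mean close to 0 and std. dev. 1
summarize z_*

* regress can also report standardized coefficients directly via the beta option
regress price nox crime rooms dist stratio, beta
```

With correctly centered variables the intercept of a regression of z_price on the other z_ variables should be essentially zero, which is a convenient check on the construction.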
. reg price nox crime rooms dist rooms stratio
note: rooms omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =       506
-------------+----------------------------------   F(5, 500)       =    174.47
       Model |  2.7223e+10         5  5.4445e+09   Prob > F        =    0.0000
    Residual |  1.5603e+10       500  31205611.6   R-squared       =    0.6357
-------------+----------------------------------   Adj R-squared   =    0.6320
       Total |  4.2826e+10       505    84803032   Root MSE        =    5586.2

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         nox |  -2706.433   354.0869    -7.64   0.000    -3402.114   -2010.751
       crime |   -153.601   32.92883    -4.66   0.000    -218.2969   -88.90504
       rooms |   6735.498   393.6037    17.11   0.000     5962.177    7508.819
        dist |  -1026.806   188.1079    -5.46   0.000    -1396.386    -657.227
       rooms |          0  (omitted)
     stratio |  -1149.204   127.4287    -9.02   0.000    -1399.566   -898.8422
       _cons |   20871.13   5054.599     4.13   0.000     10940.26       30802
------------------------------------------------------------------------------

. gen zprice=(price-22511.51)/9208.856

. * Note: the logged run typed (nox--5.549783), which adds the mean of nox instead
. * of subtracting it. That shifts znox by a constant, so the slopes in the table
. * below are unaffected, but the reported _cons (3.262101) would be approximately
. * zero with the corrected line shown here.
. gen znox=(nox-5.549783)/1.158395

. gen zcrime=(crime-3.611536)/8.590247

. gen zrooms=(rooms-6.284051)/.7025938

. gen zdist=(dist-3.795751)/2.106137

. gen zstratio=(stratio-18.45929)/2.16582

. * beta-coefficient version of the hedonic regression
. reg zprice znox zcrime zrooms zdist zstratio

      Source |       SS           df       MS      Number of obs   =       506
-------------+----------------------------------   F(5, 500)       =    174.47
       Model |  321.011242         5  64.2022485   Prob > F        =    0.0000
    Residual |  183.988783       500  .367977565   R-squared       =    0.6357
-------------+----------------------------------   Adj R-squared   =    0.6320
       Total |  505.000025       505  1.00000005   Root MSE        =    .60661

------------------------------------------------------------------------------
      zprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        znox |   -.340446   .0445411    -7.64   0.000    -.4279567   -.2529352
      zcrime |  -.1432828   .0307168    -4.66   0.000    -.2036327   -.0829328
      zrooms |   .5138879   .0300302    17.11   0.000      .454887    .5728887
       zdist |  -.2348386   .0430217    -5.46   0.000    -.3193642   -.1503129
    zstratio |  -.2702799   .0299698    -9.02   0.000    -.3291622   -.2113976
       _cons |   3.262101   .4276372     7.63   0.000     2.421914    4.102288
------------------------------------------------------------------------------

. /* The beta coefficient of nox equals (std. dev. of nox / std. dev. of price)
> times the coefficient on nox from the usual multiple regression */
. scalar beta1nox=(1.158395/9208.856)*-2706.433

. * compare with the estimated coefficient on znox in the beta regression
. ** USING LOGS
. * regression of house prices on pollution in log form
. reg lprice lnox rooms

      Source |       SS           df       MS      Number of obs   =       506
-------------+----------------------------------   F(2, 503)       =    265.69
       Model |  43.4513652         2  21.7256826   Prob > F        =    0.0000
    Residual |  41.1308598       503  .081771093   R-squared       =    0.5137
-------------+----------------------------------   Adj R-squared   =    0.5118
       Total |   84.582225       505  .167489554   Root MSE        =    .28596

------------------------------------------------------------------------------
      lprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnox |  -.7176736   .0663397   -10.82   0.000    -.8480106   -.5873366
       rooms |   .3059183   .0190174    16.09   0.000      .268555    .3432816
       _cons |   9.233738   .1877406    49.18   0.000     8.864885     9.60259
------------------------------------------------------------------------------

.
.
.
. * WHY USE LOG(Y)?
. /* Using logs has several advantages. It yields constant elasticities. More
> importantly, in many cases it brings the distribution of the dependent variable
> closer to the normal, so the model comes closer to the CLM assumptions. */
. use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/WAGE1.DTA", clear

.
.
.
. histogram wage, normal bin(20)
(bin=20, start=.52999997, width=1.2225)

. histogram lwage, normal bin(20)
(bin=20, start=-.63487834, width=.19264769)

.
.
. ** Example 6.3: Effect of class attendance on the final exam **
.
. * This example shows how to use an interaction between explanatory variables to
> analyze an effect
.
. use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/attend.dta", clear

.
. summ priGPA

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      priGPA |        680    2.586775    .5447141       .857       3.93

.
. gen priGPA2=priGPA*priGPA

. gen ACT2=ACT*ACT

. gen priatn=priGPA*atndrte

. reg stndfnl atndrte priGPA ACT priGPA2 ACT2 priatn

      Source |       SS           df       MS      Number of obs   =       680
-------------+----------------------------------   F(6, 673)       =     33.25
       Model |  152.001032         6  25.3335053   Prob > F        =    0.0000
    Residual |  512.762536       673  .761905701   R-squared       =    0.2287
-------------+----------------------------------   Adj R-squared   =    0.2218
       Total |  664.763568       679  .979033237   Root MSE        =    .87287

------------------------------------------------------------------------------
     stndfnl |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     atndrte |  -.0067129   .0102321    -0.66   0.512    -.0268035    .0133777
      priGPA |   -1.62854   .4810025    -3.39   0.001    -2.572986   -.6840939
         ACT |  -.1280394    .098492    -1.30   0.194     -.321428    .0653492
     priGPA2 |   .2959046   .1010495     2.93   0.004     .0974945    .4943148
        ACT2 |   .0045334   .0021764     2.08   0.038       .00026    .0088068
      priatn |   .0055859   .0043174     1.29   0.196    -.0028913    .0140631
       _cons |   2.050293   1.360319     1.51   0.132    -.6206863    4.721272
------------------------------------------------------------------------------

. * Partial Effect of Attendance on the Final Exam *
.
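One possible shortcut, which is not part of the original session: right after the regression above, lincom can evaluate the partial effect of atndrte at priGPA = 2.59 together with its standard error, t statistic and confidence interval. The sketch assumes the regression that includes priatn is still the active estimation; 2.59 is the rounded sample mean of priGPA from the summ output.

```stata
* Assumes the regression of stndfnl on atndrte ... priatn is the active estimation
* Partial effect of atndrte evaluated at priGPA = 2.59 (rounded sample mean)
lincom atndrte + 2.59*priatn
```

Because re-centering the interaction is only a reparametrization of the same model, this should reproduce the coefficient and standard error that the log obtains on atndrte once priatn is replaced by the centered interaction.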
. /* The partial effect of class attendance on the final exam (the partial
> derivative of stndfnl, the final exam score, with respect to atndrte, the
> attendance rate), with priGPA evaluated at the sample mean = 2.59 (see the mean
> in summ, above) */
.
. display _b[atndrte]+_b[priatn]*2.59
.00775457

.
. /* It is easier to rerun the regression replacing the variable priatn with
> priatn2, defined as follows: */
. gen priatn2=(priGPA-2.59)*atndrte

.
. /* We now use priatn2 in place of priatn as a regressor. The estimated coefficient
> on atndrte then gives the effect on the final exam score at the mean of
> priGPA = 2.59 directly, and its standard error is the standard error of that
> effect. */
. reg stndfnl atndrte priGPA ACT priGPA2 ACT2 priatn2

      Source |       SS           df       MS      Number of obs   =       680
-------------+----------------------------------   F(6, 673)       =     33.25
       Model |  152.001032         6  25.3335053   Prob > F        =    0.0000
    Residual |  512.762536       673    .7619057   R-squared       =    0.2287
-------------+----------------------------------   Adj R-squared   =    0.2218
       Total |  664.763568       679  .979033237   Root MSE        =    .87287

------------------------------------------------------------------------------
     stndfnl |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     atndrte |   .0077546   .0026393     2.94   0.003     .0025723    .0129368
      priGPA |   -1.62854   .4810025    -3.39   0.001    -2.572986   -.6840939
         ACT |  -.1280394    .098492    -1.30   0.194     -.321428    .0653492
     priGPA2 |   .2959046   .1010495     2.93   0.004     .0974945    .4943148
        ACT2 |   .0045334   .0021764     2.08   0.038       .00026    .0088068
     priatn2 |   .0055859   .0043174     1.29   0.196    -.0028913    .0140631
       _cons |   2.050293   1.360319     1.51   0.132    -.6206861    4.721273
------------------------------------------------------------------------------

. * t statistic:
. scalar tc = _b[atndrte]/_se[atndrte]

. * p-value of tc:
. scalar phalf = ttail(673,abs(tc))

. scalar ptc = 2*phalf

.
. di "p-value for the test H0: beta1 = 0 is " ptc
p-value for the test H0: beta1 = 0 is .00341499

.
.
. ** CEO Salary Example **
.
.
.
. use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/CEOSAL1.DTA", clear

.
. * Which regression should we choose? R2 and adjusted R2 cannot be used here,
. * because the dependent variables differ (salary versus lsalary)
.
. reg salary sales roe

      Source |       SS           df       MS      Number of obs   =       209
-------------+----------------------------------   F(2, 206)       =      3.09
       Model |  11427511.8         2  5713755.89   Prob > F        =    0.0474
    Residual |   380305470       206  1846143.06   R-squared       =    0.0292
-------------+----------------------------------   Adj R-squared   =    0.0197
       Total |   391732982       208  1883331.64   Root MSE        =    1358.7

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       sales |   .0163416   .0088736     1.84   0.067    -.0011532    .0338363
         roe |   19.63097   11.07655     1.77   0.078     -2.20697    41.46891
       _cons |   830.6313   223.9049     3.71   0.000     389.1924     1272.07
------------------------------------------------------------------------------

. reg lsalary lsales roe

      Source |       SS           df       MS      Number of obs   =       209
-------------+----------------------------------   F(2, 206)       =     40.45
       Model |  18.8148991         2  9.40744953   Prob > F        =    0.0000
    Residual |  47.9072642       206  .232559535   R-squared       =    0.2820
-------------+----------------------------------   Adj R-squared   =    0.2750
       Total |  66.7221632       208  .320779631   Root MSE        =    .48224

------------------------------------------------------------------------------
     lsalary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lsales |   .2750874    .033254     8.27   0.000     .2095257    .3406491
         roe |   .0178723   .0039551     4.52   0.000     .0100746    .0256699
       _cons |   4.362167   .2938776    14.84   0.000     3.782774    4.941561
------------------------------------------------------------------------------

.
. * The choice is based on the expected sign of the estimated coefficients and on
> their statistical significance
.
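As a complement to judging the two models by signs and significance, a goodness-of-fit number for the log model that is measured in the same units as salary can be sketched as follows. This is an assumption-laden illustration rather than part of the original session: it assumes CEOSAL1.DTA is still in memory, it simply exponentiates the fitted values from the log model, and the names lsalhat and mhat1 are hypothetical.

```stata
* Assumes CEOSAL1.DTA is in memory; lsalhat and mhat1 are illustrative names
quietly regress lsalary lsales roe
predict lsalhat, xb

* Back-transform the fitted values to the salary scale
generate mhat1 = exp(lsalhat)

* Squared correlation between salary and the back-transformed fit
quietly correlate salary mhat1
display "squared correlation, log model on salary scale: " r(rho)^2
```

The squared correlation is unchanged by rescaling mhat1 by a positive constant, so it can be set beside the R-squared of 0.0292 from the regression of salary on sales and roe. It is a descriptive comparison only, not a formal test.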
. ** Example 6.5: Confidence Interval for a College GPA Prediction (approach of
> Wooldridge's textbook) **
.
.
. use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/gpa2.dta", clear

.
.
.
. gen hsize2=hsize*hsize

. reg colgpa sat hsperc hsize hsize2

      Source |       SS           df       MS      Number of obs   =     4,137
-------------+----------------------------------   F(4, 4132)      =    398.02
       Model |  499.030504         4  124.757626   Prob > F        =    0.0000
    Residual |  1295.16517     4,132  .313447524   R-squared       =    0.2781
-------------+----------------------------------   Adj R-squared   =    0.2774
       Total |  1794.19567     4,136  .433799728   Root MSE        =    .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sat |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
      hsperc |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
       hsize |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
      hsize2 |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   1.492652   .0753414    19.81   0.000     1.344942    1.640362
------------------------------------------------------------------------------

.
. * Prediction of college GPA *
. * In the book, Wooldridge defines the prediction as theta0hat = beta0hat +
> beta1hat*c1 + ... + betakhat*ck (stored below in the scalar teta0hat)
. scalar teta0hat = _b[_cons]+_b[sat]*1200+_b[hsperc]*30+_b[hsize]*5+_b[hsize2]*25

. display teta0hat
2.7000755

. /* The book also offers an alternative: write beta0 = theta0 - beta1*c1 - ... -
> betak*ck, substitute it into the equation, and regress y on (x1-c1), ...,
> (xk-ck). The estimated constant of that regression is theta0hat, with the
> advantage that its standard error is the standard error of theta0hat. This
> allows a confidence interval for theta0, the predicted value of y given
> x1=c1, ..., xk=ck. The method is shown next: */
.
.
. gen sat0=sat-1200

. gen hsperc0=hsperc-30

. gen hsize0=hsize-5

.
. gen hsize20=hsize2-25

. * The constant of this regression is the theta0hat (predicted college GPA) from
> the book and from class
.
. reg colgpa sat0 hsperc0 hsize0 hsize20

      Source |       SS           df       MS      Number of obs   =     4,137
-------------+----------------------------------   F(4, 4132)      =    398.02
       Model |  499.030503         4  124.757626   Prob > F        =    0.0000
    Residual |  1295.16517     4,132  .313447524   R-squared       =    0.2781
-------------+----------------------------------   Adj R-squared   =    0.2774
       Total |  1794.19567     4,136  .433799728   Root MSE        =    .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sat0 |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
     hsperc0 |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
      hsize0 |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
     hsize20 |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   2.700075   .0198778   135.83   0.000     2.661104    2.739047
------------------------------------------------------------------------------

.
. /* Another way to obtain the confidence interval for the prediction. It still
> treats the prediction as in-sample (an interval for the expected value of
> colgpa), but it is computed differently. It gives the same values as above up
> to the critical value used (1.96 here instead of the t quantile); check for
> yourself. */
.
.
. reg colgpa sat hsperc hsize hsize2

      Source |       SS           df       MS      Number of obs   =     4,137
-------------+----------------------------------   F(4, 4132)      =    398.02
       Model |  499.030504         4  124.757626   Prob > F        =    0.0000
    Residual |  1295.16517     4,132  .313447524   R-squared       =    0.2781
-------------+----------------------------------   Adj R-squared   =    0.2774
       Total |  1794.19567     4,136  .433799728   Root MSE        =    .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sat |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
      hsperc |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
       hsize |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
      hsize2 |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   1.492652   .0753414    19.81   0.000     1.344942    1.640362
------------------------------------------------------------------------------

.
. set obs 4138
number of observations (_N) was 4,137, now 4,138

. replace sat=1200 in 4138/4138
(1 real change made)

. replace hsperc=30 in 4138/4138
(1 real change made)

. replace hsize=5 in 4138/4138
(1 real change made)

. replace hsize2=25 in 4138/4138
(1 real change made)

.
. * To redisplay the previous regression, now that observation 4138 has been
> added to the data, just type regress *
. regress

      Source |       SS           df       MS      Number of obs   =     4,137
-------------+----------------------------------   F(4, 4132)      =    398.02
       Model |  499.030504         4  124.757626   Prob > F        =    0.0000
    Residual |  1295.16517     4,132  .313447524   R-squared       =    0.2781
-------------+----------------------------------   Adj R-squared   =    0.2774
       Total |  1794.19567     4,136  .433799728   Root MSE        =    .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sat |   .0014925   .0000652    22.89   0.000     .0013646    .0016204
      hsperc |  -.0138558    .000561   -24.70   0.000    -.0149557   -.0127559
       hsize |  -.0608815   .0165012    -3.69   0.000    -.0932328   -.0285302
      hsize2 |   .0054603   .0022698     2.41   0.016     .0010102    .0099104
       _cons |   1.492652   .0753414    19.81   0.000     1.344942    1.640362
------------------------------------------------------------------------------

.
.
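A third route to the same interval, sketched here as an alternative that the original session does not use, is to ask lincom for the linear combination of coefficients that defines the prediction. It assumes the regression of colgpa on sat, hsperc, hsize and hsize2 is the active estimation, as it is at this point in the log.

```stata
* Assumes the colgpa regression on sat hsperc hsize hsize2 is the active estimation
* Prediction at sat=1200, hsperc=30, hsize=5 (so hsize2=25), with its standard error
lincom _cons + 1200*sat + 30*hsperc + 5*hsize + 25*hsize2

* t critical value with 4132 degrees of freedom, for comparison with 1.96
display invttail(4132, .025)
```

lincom builds its interval from the t distribution, so it should match the interval on _cons in the centered regression above (2.661104 to 2.739047). Intervals built by hand with 1.96 come out marginally narrower because the t critical value is slightly larger than 1.96.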
. /* Now we take observation 4138 and compute the predicted college GPA for it,
> colgpahatt, which equals teta0hat above, together with its standard error and
> the lower (lb) and upper (ub) limits of the confidence interval for the
> prediction */
.
. predict sigmahat4138 in 4138/4138,stdp
(4,137 missing values generated)

. predict colgpahatt in 4138/4138,xb
(4,137 missing values generated)

. gen lb = colgpahatt-1.96* sigmahat4138 in 4138/4138
(4,137 missing values generated)

. gen ub = colgpahatt+1.96* sigmahat4138 in 4138/4138
(4,137 missing values generated)

. list sigmahat4138 lb colgpahatt ub in 4138/4138

      +-------------------------------------------+
      | sig~4138         lb   colgpa~t         ub |
      |-------------------------------------------|
4138. | .0198778   2.661115   2.700075   2.739036 |
      +-------------------------------------------+

.
.
. ** Example 6.6: Prediction for a Future colGPA **
.
. /* This prediction is for an additional student who was not part of the sample.
> The values of the explanatory variables are taken to be the same as those of
> the previous observation, so this is an out-of-sample prediction. In this case
> the variance of the prediction error is the sampling variance of the prediction
> (the square of the stdp value above) plus the error variance associated with
> the new observation. */
. use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/gpa2.dta", clear

.
.
. gen hsize2=hsize*hsize

. reg colgpa sat hsperc hsize hsize2

      Source |       SS           df       MS      Number of obs   =     4,137
-------------+----------------------------------   F(4, 4132)      =    398.02
       Model |  499.030504         4  124.757626   Prob > F        =    0.0000
    Residual |  1295.16517     4,132  .313447524   R-squared       =    0.2781
-------------+----------------------------------   Adj R-squared   =    0.2774
       Total |  1794.19567     4,136  .433799728   Root MSE        =    .55986

------------------------------------------------------------------------------
      colgpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf.
Interval] -------------+---------------------------------------------------------------- sat | .0014925 .0000652 22.89 0.000 .0013646 .0016204 hsperc | -.0138558 .000561 -24.70 0.000 -.0149557 -.0127559 hsize | -.0608815 .0165012 -3.69 0.000 -.0932328 -.0285302 hsize2 | .0054603 .0022698 2.41 0.016 .0010102 .0099104 _cons | 1.492652 .0753414 19.81 0.000 1.344942 1.640362 ------------------------------------------------------------------------------ . . . set obs 4138 number of observations (_N) was 4,137, now 4,138 . replace sat=1200 in 4138/4138 (1 real change made) . replace hsperc=30 in 4138/4138 (1 real change made) . replace hsize=5 in 4138/4138 (1 real change made) . replace hsize2=25 in 4138/4138 (1 real change made) . regress Source | SS df MS Number of obs = 4,137 -------------+---------------------------------- F(4, 4132) = 398.02 Model | 499.030504 4 124.757626 Prob > F = 0.0000 Residual | 1295.16517 4,132 .313447524 R-squared = 0.2781 -------------+---------------------------------- Adj R-squared = 0.2774 Total | 1794.19567 4,136 .433799728 Root MSE = .55986 ------------------------------------------------------------------------------ colgpa | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- sat | .0014925 .0000652 22.89 0.000 .0013646 .0016204 hsperc | -.0138558 .000561 -24.70 0.000 -.0149557 -.0127559 hsize | -.0608815 .0165012 -3.69 0.000 -.0932328 -.0285302 hsize2 | .0054603 .0022698 2.41 0.016 .0010102 .0099104 _cons | 1.492652 .0753414 19.81 0.000 1.344942 1.640362 ------------------------------------------------------------------------------ . . . predict seehatzero in 4138/4138,stdf (4,137 missing values generated) . predict colgpahatt in 4138/4138,xb (4,137 missing values generated) . gen lb1 = colgpahatt-1.96* seehatzero in 4138/4138 (4,137 missing values generated) . gen ub1 = colgpahatt+1.96* seehatzero in 4138/4138 (4,137 missing values generated) . 
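The forecast standard error requested with the stdf option can be checked by hand. The sketch below combines the standard error of the prediction from Example 6.5 (.0198778) with the residual mean square from the regression table (.313447524), following the usual formula se(forecast) = sqrt(se(prediction)^2 + sigma2hat). The scalar names se_mean, sigma2 and se_fcst are hypothetical.

```stata
* Standard error of the prediction (stdp) reported in Example 6.5
scalar se_mean = .0198778
* Residual MS from the colgpa regression table (estimate of the error variance)
scalar sigma2  = .313447524

* Forecast standard error: sampling variance of the prediction plus error variance
scalar se_fcst = sqrt(se_mean^2 + sigma2)
display se_fcst

* Approximate 95% prediction interval around the predicted value 2.700075
display 2.700075 - 1.96*se_fcst
display 2.700075 + 1.96*se_fcst
```

Almost all of the forecast variance comes from sigma2, which is why this interval is so much wider than the confidence interval for the expected colgpa.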
. list seehatzero lb1 colgpahatt ub1 in 4138/4138

      +-----------------------------------------+
      | seehat~o        lb1   colgpa~t      ub1 |
      |-----------------------------------------|
4138. | .5602166   1.602051   2.700075   3.7981 |
      +-----------------------------------------+

.
.
.
. *** Example 6.7: Predicting CEO Salary ***
.
. use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/CEOSAL2.DTA", clear

.
.
. reg lsalary lsales lmktval ceoten

      Source |       SS           df       MS      Number of obs   =       177
-------------+----------------------------------   F(3, 173)       =     26.91
       Model |  20.5672434         3  6.85574779   Prob > F        =    0.0000
    Residual |  44.0789697       173  .254791732   R-squared       =    0.3182
-------------+----------------------------------   Adj R-squared   =    0.3063
       Total |  64.6462131       176  .367308029   Root MSE        =    .50477

------------------------------------------------------------------------------
     lsalary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lsales |   .1628545   .0392421     4.15   0.000     .0853995    .2403094
     lmktval |    .109243   .0495947     2.20   0.029     .0113545    .2071315
      ceoten |   .0117054   .0053261     2.20   0.029      .001193    .0222178
       _cons |   4.503795   .2572344    17.51   0.000     3.996073    5.011517
------------------------------------------------------------------------------

.
.
. predict lsal, xb

. gen mhat=exp(lsal)

.
. * (Incorrect) Prediction of CEO Salary *
. /* It is incorrect because it ignores that exp(lsalaryhat) has to be multiplied
> by exp(sigma2/2), which is greater than 1 (see the book and the lecture notes) */
.
. display _b[_cons]+_b[lsales]*log(5000)+_b[lmktval]*log(10000)+_b[ceoten]*10
7.0140771

. scalar salhat1=exp(7.0140771)

. display salhat1
1112.1797

.
.
. ** (Correct) Prediction of CEO Salary **
.
.
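The adjustment factor exp(sigma2/2) mentioned in the comment above can be illustrated with the numbers already in the log. This sketch assumes normally distributed errors, which is what justifies that particular factor, and it uses the Residual MS from the lsalary regression (.254791732) as the estimate of sigma2. The scalar names sigma2 and lsalhat are hypothetical.

```stata
* Residual MS from the lsalary regression, used as the estimate of sigma2
scalar sigma2  = .254791732
* Predicted log salary at sales=5000, mktval=10000, ceoten=10 (from the log)
scalar lsalhat = 7.0140771

* Adjustment factor under normal errors; it exceeds 1 whenever sigma2 > 0
display exp(sigma2/2)

* Adjusted salary prediction
display exp(sigma2/2)*exp(lsalhat)
```

The regression through the origin that follows in the log estimates the adjustment factor from the data instead, so it does not depend on the normality assumption.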
reg salary mhat, noconstant

      Source |       SS           df       MS      Number of obs   =       177
-------------+----------------------------------   F(1, 176)       =    562.39
       Model |   147352711         1   147352711   Prob > F        =    0.0000
    Residual |    46113901       176  262010.801   R-squared       =    0.7616
-------------+----------------------------------   Adj R-squared   =    0.7603
       Total |   193466612       177  1093031.71   Root MSE        =    511.87

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        mhat |   1.116857   .0470953    23.71   0.000     1.023912    1.209801
------------------------------------------------------------------------------

.
. display _b[mhat]*exp(7.013)
1240.8082

.
.
. **RESIDUAL ANALYSIS AND OUTLIERS**
.
. use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/hprice1.dta", clear

.
. reg price lotsize sqrft bdrms

      Source |       SS           df       MS      Number of obs   =        88
-------------+----------------------------------   F(3, 84)        =     57.46
       Model |  617130.701         3  205710.234   Prob > F        =    0.0000
    Residual |  300723.805        84   3580.0453   R-squared       =    0.6724
-------------+----------------------------------   Adj R-squared   =    0.6607
       Total |  917854.506        87  10550.0518   Root MSE        =    59.833

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lotsize |   .0020677   .0006421     3.22   0.002     .0007908    .0033446
       sqrft |   .1227782   .0132374     9.28   0.000     .0964541    .1491022
       bdrms |   13.85252   9.010145     1.54   0.128    -4.065141    31.77018
       _cons |  -21.77031   29.47504    -0.74   0.462    -80.38466    36.84405
------------------------------------------------------------------------------

. predict r1, residual

. sort r1

. list price r1

     +---------------------+
     |   price          r1 |
     |---------------------|
  1. |     111   -120.0264 |
  2. |     318   -115.5087 |
  3. |     150   -107.0809 |
  4. |   417.5   -91.24398 |
  5. |     300   -85.46117 |
     |---------------------|
  6. |     335   -77.17268 |
  7. |     180   -74.70272 |
  8. | 209.001   -65.50285 |
  9. |     325   -63.69911 |
 10. |     244   -62.56659 |
     |---------------------|
 11. |     190   -59.84522 |
 12. |     405   -54.46616 |
 13. |     212   -54.30042 |
 14. |     265    -52.1298 |
 15. |     240   -51.44111 |
     |---------------------|
 16. |     313   -48.70498 |
 17. |   477.5    -48.3503 |
 18. |     285   -47.85586 |
 19. |     300   -45.63977 |
 20. |     206   -43.14255 |
     |---------------------|
 21. |     209   -41.74962 |
 22. |     230   -40.86902 |
 23. |     235   -37.74981 |
 24. |     215   -36.66378 |
 25. |     246   -36.64657 |
     |---------------------|
 26. |     360   -33.80125 |
 27. |   227.4   -29.76693 |
 28. |     270   -26.69623 |
 29. |     225   -24.27153 |
 30. |   202.5   -23.65145 |
     |---------------------|
 31. |     242   -19.68343 |
 32. |     240   -17.81783 |
 33. |     316   -16.76209 |
 34. |     266   -16.59696 |
 35. |     258   -16.27121 |
     |---------------------|
 36. |     250    -13.8158 |
 37. |     306   -13.46216 |
 38. |     195   -12.08152 |
 39. |     264   -11.62921 |
 40. |     230   -11.31267 |
     |---------------------|
 41. |     191   -8.236558 |
 42. |     220   -7.662788 |
 43. |   279.5   -6.752801 |
 44. |     230   -6.707262 |
 45. |     236   -6.402439 |
     |---------------------|
 46. | 268.125   -5.446904 |
 47. |     248   -3.537785 |
 48. |     225   -2.824941 |
 49. |     380   -2.745209 |
 50. |     335    -.195089 |
     |---------------------|
 51. |     250    1.399296 |
 52. |     275    5.363331 |
 53. |     210     6.70064 |
 54. |     225    7.386314 |
 55. |     246    9.099899 |
     |---------------------|
 56. |     265    12.43361 |
 57. |     266    16.71802 |
 58. |     373    18.09319 |
 59. |     575    18.80182 |
 60. |     251    19.16811 |
     |---------------------|
 61. |     240    19.21921 |
 62. |     289    20.33443 |
 63. |     330    24.90993 |
 64. |     225    26.23623 |
 65. |     247    30.92402 |
     |---------------------|
 66. |     375    32.25395 |
 67. |     253    32.52937 |
 68. |     285    32.67597 |
 69. |     310    33.27584 |
 70. |     255    36.03143 |
     |---------------------|
 71. |     295    37.14719 |
 72. |   332.5    40.32087 |
 73. |     350    44.33447 |
 74. |     343    46.90717 |
 75. |     219    54.41837 |
     |---------------------|
 76. |     361    55.09113 |
 77. |     230    55.47031 |
 78. | 466.275     62.9396 |
 79. |     315    66.47863 |
 80. |     310    67.42652 |
     |---------------------|
 81. |     268    67.60396 |
 82. |  471.25    69.70712 |
 83. |     315    69.84325 |
 84. |     370    74.84873 |
 85. |     495    112.7292 |
     |---------------------|
 86. |     725    163.7951 |
 87. |   713.5    198.6601 |
 88. |     425    209.3758 |
     +---------------------+

.
. **NORMALITY TEST FOR THE RESIDUALS**
.
. kdensity r1, normal

. graph save Graph "/Users/denisardalves/Desktop/EAE 324 2017/Grapico- k-density res
> idual apestudo.gph", replace
(file /Users/denisardalves/Desktop/EAE 324 2017/Grapico- k-density residual apestudo
> .gph saved)

. **Stata's normality test**
.
. /* One normality test available in Stata is the Shapiro-Wilk test.
> The null hypothesis for which the p-value is computed is that the data
> are normally distributed, so a low p-value leads to rejection of H0.
> The reported p-value (Prob>z) is based on a normal approximation to the
> transformed statistic z. In our example the p-value is .00059, so H0 is
> rejected at conventional significance levels: normality of the residuals
> is rejected, which leads us to doubt the exact t and F test results. */
.
. * Command for the residual normality test:*
.
. swilk r1

                   Shapiro-Wilk W test for normal data

    Variable |        Obs       W           V         z       Prob>z
-------------+------------------------------------------------------
          r1 |         88    0.94132      4.357     3.242    0.00059

.
.
.
.
end of do-file

. exit, clear
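
* ----------------------------------------------------------------------------
* Sketch (not part of the logged session, and not run here, so no output is
* shown): three possible scale factors for exp(lsalaryhat) when predicting
* CEO salary in levels, following the discussion in Example 6.7 of this log.
* Assumptions: the names lhat0, a_norm, a_smear, a_reg, lsalhat, uhat and expu
* are illustrative and do not appear in the original do-file. The factor
* exp(sigma2/2) presumes normally distributed errors; the smearing factor and
* the regression through the origin do not rely on normality.
* ----------------------------------------------------------------------------
use "/Users/denisardalves/Desktop/EAE 324 2017/DATA SET/CEOSAL2.DTA", clear
regress lsalary lsales lmktval ceoten

* fitted log salary at sales = 5000, mktval = 10000, ceoten = 10
scalar lhat0 = _b[_cons] + _b[lsales]*ln(5000) + _b[lmktval]*ln(10000) + _b[ceoten]*10

* (a) normality-based factor exp(sigma2/2), with sigma estimated by e(rmse)
scalar a_norm = exp(e(rmse)^2/2)

* (b) smearing factor: sample mean of exp(residual)
predict double lsalhat, xb
predict double uhat, residuals
generate double expu = exp(uhat)
quietly summarize expu
scalar a_smear = r(mean)

* (c) slope from regressing salary on exp(lsalhat) through the origin,
*     which is the approach used in the log
generate double mhat = exp(lsalhat)
quietly regress salary mhat, noconstant
scalar a_reg = _b[mhat]

* predicted salary in levels under each factor
display "normal: " a_norm*exp(lhat0) "  smearing: " a_smear*exp(lhat0) "  origin: " a_reg*exp(lhat0)

* The scalars lhat0 and a_norm are computed before the second regression
* because that regression overwrites _b[] and e(rmse).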