As I do not have an econometrics background, I find it difficult to understand which one to use when the results differ a lot. My dependent (female labor-force participation status), independent (has more than three children), and instrumental (combination of the genders of the first two children) variables are all binary. OLS, logit, and probit models all show similar results (as mentioned in "Mostly Harmless Econometrics", there is little difference between these models when it comes to average marginal effects). But when I run ivreg (2SLS), ivprobit, and biprobit, the results are totally different. The previous literature simply uses OLS and 2SLS, but I cannot find a convincing, solid reason behind that choice. (I checked "Mostly Harmless Econometrics" by Angrist and "Econometric Analysis of Cross Section and Panel Data", 2nd edition, by Wooldridge, where the explanations were quite vague.) What are the trade-offs between these models, how should I interpret the differences in the estimates, and how do I choose the right one?
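For reference, a minimal sketch of the three estimators being compared, written with hypothetical variable names (lfp for participation, kids3plus for the more-than-three-children indicator, samesex for the instrument, x1 x2 for controls); it is only meant to fix notation, not to settle the model choice:
Code:
* 2SLS / linear IV
ivregress 2sls lfp x1 x2 (kids3plus = samesex), vce(robust)
* IV probit (note: ivprobit treats the endogenous regressor as continuous)
ivprobit lfp x1 x2 (kids3plus = samesex)
* recursive bivariate probit for two binary equations
biprobit (lfp = kids3plus x1 x2) (kids3plus = samesex x1 x2)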
↧
ivreg, ivprobit and biprobit which one to use? (any theoretical reasoning?)
↧
Betareg
Hi all,
I have a proportion dependent variable that is greater than zero and less than one. The mean is 0.1240891, the SD is 0.1363, and it is positively skewed. All the independent variables are dummy variables. I have read that betareg is the most appropriate model for proportion data.
I want to check whether I used the right model, and also: do I need to check any assumptions before running betareg?
Betareg dep i.var1 i.var2 i.var3 i.var4
margins dep
After running estat ic to check the model fit, the AIC and BIC are approximately -4400.
Thank you in advance.
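For what it's worth, a minimal sketch of the usual betareg workflow with the names from the post (dep, var1-var4); note that margins normally wants covariates or a dydx() option rather than the dependent variable:
Code:
betareg dep i.var1 i.var2 i.var3 i.var4
margins, dydx(var1 var2 var3 var4)   // average marginal effects on the proportion scale
estat ic                             // AIC/BIC, e.g. for comparing link() choices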
↧
↧
parsing with variant length strings; overcoming ustrregexra greediness
I'm currently trying to parse text from string variables like this:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str1964 FieldLabel
`"<div style="background:tomato; padding:5px; margin:5px;"><font color="black"> <font size="5">Verify alpha codes are equal, values are not the same </font></strong></div>"'
`"<div style="background:lightgrey; padding:5px; margin:5px;"><font color="black"><font size="5">Please answer the following questions. </font>"'
end
On a former post of mine for a similar issue, William Lisowski recommended using -ustrregexra- before -split- to parse length-variant substrings that are book-ended by like symbols. The only problem with that strategy here is that -ustrregexra- is "greedy", meaning that if I use "<" and ">" to create new symbols to parse with, the entire string gets replaced. To illustrate,
using:
Code:
g newtext = ustrregexra(FieldLabel,"<.*>","!!split!!")
I want:
Code:
!!split!!!!split!!!!split!!Please verify PID codes are equal, values are not the same !!split!!!!split!!!!split!!
but I get:
Code:
!!split!!
Can anyone recommend a solution to the greedy problem, or perhaps another strategy entirely?
Thank you!
-Reese
v.14.2
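One possible direction, sketched with the same variable name (FieldLabel) and untested against the full data: make the pattern non-greedy, either with a lazy quantifier or with a negated character class, so each tag is replaced individually:
Code:
* lazy quantifier: match as little as possible between < and >
g newtext = ustrregexra(FieldLabel, "<.*?>", "!!split!!")
* or a negated character class, which cannot run past the first >
g newtext2 = ustrregexra(FieldLabel, "<[^>]*>", "!!split!!")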
↧
LSDV, problem with dummy variables and significance
Hello,
my data: panel data, 260 observations; 20 Regions and time period: 2004 to 2016.
I have found endogeneity between corruption and GDP growth and I have chosen FE over RE because of the results of the Hausman test. Also, I chose LSDV over FE-2SLS because I would like to see the dummy coefficients.
Hence, I am running an LSDV model where I want to see the effect that the corruption level (coded as Cor) of each Region (coded as countrynum) has on Y (which is the GDP growth rate).
I do not understand why the model is significant with i.countrynum but becomes insignificant when I include i.Year. Also, when using i.countrynum, why would my corruption variable become insignificant when I use log(corruption) instead of Corruption?
Also, I have run testparm on i.countrynum and i.year and the p values are close to 0.
Code:
. reg Y I logYlevel_1 n H Cor i.countrynum, robust Linear regression Number of obs = 240 F(24, 215) = 4.04 Prob > F = 0.0000 R-squared = 0.2233 Root MSE = .02424 ------------------------------------------------------------------------------ | Robust Y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- I | .0421328 .0235268 1.79 0.075 -.00424 .0885055 logYlevel_1 | -.2725869 .0477639 -5.71 0.000 -.3667323 -.1784415 n | -.0011075 .0002068 -5.36 0.000 -.0015151 -.0007 H | -.5249337 .1279176 -4.10 0.000 -.777067 -.2728005 Cor | 152.1827 43.705 3.48 0.001 66.03756 238.3279 | countrynum | 2 | -.0568746 .018295 -3.11 0.002 -.092935 -.0208142 3 | -.1177966 .0196517 -5.99 0.000 -.1565313 -.079062 4 | -.132077 .0269762 -4.90 0.000 -.1852487 -.0789053 5 | .0762482 .0194006 3.93 0.000 .0380084 .1144879 6 | .0528407 .0149889 3.53 0.001 .0232966 .0823847 7 | .0857303 .0238629 3.59 0.000 .0386951 .1327655 8 | .0685547 .0154954 4.42 0.000 .0380124 .099097 9 | .0485808 .0342438 1.42 0.157 -.0189157 .1160773 10 | .0277995 .01135 2.45 0.015 .0054279 .0501711 11 | -.0446546 .0128474 -3.48 0.001 -.0699775 -.0193316 12 | .0164898 .0176551 0.93 0.351 -.0183094 .051289 13 | -.129616 .0228253 -5.68 0.000 -.1746061 -.084626 14 | -.0697799 .0127563 -5.47 0.000 -.0949234 -.0446365 15 | -.1306724 .0235665 -5.54 0.000 -.1771235 -.0842214 16 | .0428457 .0137236 3.12 0.002 .0157956 .0698958 17 | .1167549 .021726 5.37 0.000 .0739316 .1595782 18 | .0173388 .0115789 1.50 0.136 -.0054839 .0401616 19 | .0970503 .0240081 4.04 0.000 .0497288 .1443717 20 | .0306902 .0163222 1.88 0.061 -.0014819 .0628623 | _cons | 2.812614 .4884072 5.76 0.000 1.849934 3.775293 ------------------------------------------------------------------------------
Code:
. reg Y I logYlevel_1 n H logCor i.countrynum,robust Linear regression Number of obs = 225 F(24, 200) = 4.83 Prob > F = 0.0000 R-squared = 0.2391 Root MSE = .0232 ------------------------------------------------------------------------------ | Robust Y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- I | .0438102 .023208 1.89 0.061 -.0019535 .0895738 logYlevel_1 | -.2952572 .0474944 -6.22 0.000 -.3889112 -.2016031 n | -.0010141 .0001769 -5.73 0.000 -.001363 -.0006652 H | -.5967442 .1222761 -4.88 0.000 -.83786 -.3556284 logCor | -.0001871 .002452 -0.08 0.939 -.0050222 .004648 | countrynum | 2 | -.0776557 .0163377 -4.75 0.000 -.109872 -.0454395 3 | -.1267072 .0195275 -6.49 0.000 -.1652135 -.0882009 4 | -.1387356 .0270324 -5.13 0.000 -.1920407 -.0854305 5 | .0833008 .0189025 4.41 0.000 .046027 .1205746 6 | .0572389 .0149254 3.83 0.000 .0278075 .0866703 7 | .0958031 .0238787 4.01 0.000 .0487167 .1428895 8 | .0765805 .0158141 4.84 0.000 .0453967 .1077642 9 | .0565331 .0340127 1.66 0.098 -.0105365 .1236026 10 | .030444 .011336 2.69 0.008 .0080905 .0527975 11 | -.0304818 .0129706 -2.35 0.020 -.0560585 -.0049051 12 | .0196087 .0175451 1.12 0.265 -.0149885 .0542058 13 | -.139783 .0227806 -6.14 0.000 -.1847039 -.0948621 14 | -.0761153 .0127299 -5.98 0.000 -.1012175 -.0510132 15 | -.1402884 .0235798 -5.95 0.000 -.1867854 -.0937915 16 | .0473936 .0139128 3.41 0.001 .019959 .0748281 17 | .1254732 .021914 5.73 0.000 .082261 .1686854 18 | .0189846 .0122194 1.55 0.122 -.0051108 .0430799 19 | .1095675 .0269841 4.06 0.000 .0563576 .1627774 20 | .0338714 .0162419 2.09 0.038 .0018441 .0658987 | _cons | 3.049778 .4834469 6.31 0.000 2.096471 4.003085 ------------------------------------------------------------------------------
Code:
. reg Y I logYlevel_1 n H Cor i.countrynum i.Year, robust Linear regression Number of obs = 240 F(35, 204) = 19.07 Prob > F = 0.0000 R-squared = 0.7447 Root MSE = .01427 ------------------------------------------------------------------------------ | Robust Y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- I | -.0017374 .0152018 -0.11 0.909 -.0317101 .0282353 logYlevel_1 | -.1344197 .043968 -3.06 0.003 -.2211097 -.0477297 n | -.0000933 .0003057 -0.31 0.761 -.0006961 .0005095 H | .1089849 .1461164 0.75 0.457 -.179107 .3970768 Cor | 42.22968 27.68753 1.53 0.129 -12.36074 96.82009 | countrynum | 2 | -.0151863 .0160518 -0.95 0.345 -.046835 .0164625 3 | -.0482293 .0157838 -3.06 0.003 -.0793496 -.017109 4 | -.0381451 .0174077 -2.19 0.030 -.0724672 -.003823 5 | .0448873 .0187381 2.40 0.018 .0079421 .0818325 6 | .0300398 .0107687 2.79 0.006 .0088075 .0512721 7 | .0336804 .0214411 1.57 0.118 -.0085941 .075955 8 | .0259668 .011634 2.23 0.027 .0030284 .0489052 9 | .0567836 .0316731 1.79 0.074 -.005665 .1192323 10 | .0101893 .0064318 1.58 0.115 -.0024919 .0228706 11 | -.0263025 .0101509 -2.59 0.010 -.0463165 -.0062884 12 | .0268867 .0158145 1.70 0.091 -.0042941 .0580675 13 | -.0394516 .0159554 -2.47 0.014 -.0709102 -.007993 14 | -.0212305 .0097003 -2.19 0.030 -.0403562 -.0021048 15 | -.0406477 .0160236 -2.54 0.012 -.0722408 -.0090546 16 | .026545 .0117709 2.26 0.025 .0033367 .0497533 17 | .0621137 .0196756 3.16 0.002 .02332 .1009073 18 | -.0060011 .0064424 -0.93 0.353 -.0187033 .006701 19 | .049247 .0188249 2.62 0.010 .0121307 .0863634 20 | .0361759 .0169272 2.14 0.034 .0028012 .0695507 | Year | 2006 | .0145628 .0034744 4.19 0.000 .0077125 .0214131 2007 | .0062804 .0040101 1.57 0.119 -.0016263 .014187 2008 | -.0235486 .005161 -4.56 0.000 -.0337243 -.013373 2009 | -.0631985 .0054573 -11.58 0.000 -.0739584 -.0524386 2010 | -.0056302 .0071231 -0.79 0.430 -.0196746 .0084142 2011 | -.0116207 .0053297 -2.18 0.030 -.022129 -.0011123 2012 | -.0420052 .006943 -6.05 0.000 -.0556944 -.0283159 2013 | -.0412766 .0104555 -3.95 0.000 -.0618912 -.0206619 2014 | -.0257527 .0091024 -2.83 0.005 -.0436995 -.0078059 2015 | -.0074938 .0120268 -0.62 0.534 -.0312067 .0162191 2016 | -.0120389 .0093528 -1.29 0.199 -.0304795 .0064017 | _cons | 1.349846 .4464339 3.02 0.003 .4696303 2.230063 ------------------------------------------------------------------------------
Code:
reg Y I logYlevel_1 n H Cor i.Year, robust Linear regression Number of obs = 240 F(16, 223) = 36.46 Prob > F = 0.0000 R-squared = 0.7030 Root MSE = .01472 ------------------------------------------------------------------------------ | Robust Y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- I | .0002541 .0022204 0.11 0.909 -.0041216 .0046298 logYlevel_1 | .0047408 .0051855 0.91 0.362 -.0054781 .0149596 n | -.0002482 .0002666 -0.93 0.353 -.0007736 .0002772 H | -.0516428 .0513114 -1.01 0.315 -.1527602 .0494745 Cor | .9710216 20.83673 0.05 0.963 -40.09107 42.03312 | Year | 2006 | .0150445 .0029372 5.12 0.000 .0092562 .0208328 2007 | .0066831 .0030496 2.19 0.029 .0006734 .0126929 2008 | -.0241049 .0040395 -5.97 0.000 -.0320654 -.0161444 2009 | -.0608203 .0044387 -13.70 0.000 -.0695674 -.0520731 2010 | .0056591 .0052307 1.08 0.280 -.0046489 .0159672 2011 | -.0011653 .0038145 -0.31 0.760 -.0086823 .0063518 2012 | -.0292677 .0043704 -6.70 0.000 -.0378802 -.0206552 2013 | -.0219385 .0081194 -2.70 0.007 -.037939 -.005938 2014 | -.0047909 .0041872 -1.14 0.254 -.0130425 .0034608 2015 | .0144844 .0063197 2.29 0.023 .0020304 .0269385 2016 | .0083592 .0044414 1.88 0.061 -.0003933 .0171117 | _cons | -.0385648 .0492351 -0.78 0.434 -.1355904 .0584608 ------------------------------------------------------------------------------
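For reference, a minimal sketch of the equivalent within (fixed-effects) estimator with cluster-robust standard errors, using the variable names from the output above; this is only meant as a cross-check on the LSDV results, not as a recommendation:
Code:
xtset countrynum Year
xtreg Y I logYlevel_1 n H Cor i.Year, fe vce(cluster countrynum)
testparm i.Year   // joint significance of the year dummies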
↧
Storing results with a large loop (>11000)
I am trying to store results across a large loop. I have previously done it as follows, but I am limited by the maximum matrix size (11,000).
Code:
matrix z = J(1,6,.)
matrix Catch = J(`n',1,.)
forval i = 1/`n' {
    quietly reg y x`i'
    quietly estat ic
    matrix z = r(S)
    matrix Catch[`i',1] = z[1,5]
}
where `n' is a number significantly greater than 11,000.
Is there a way to store these values elsewhere or output them quickly? I know I can use the "putexcel" command, but it is slow to call it in every loop iteration. Are there other options?
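One route that avoids the matrix-size limit altogether (a sketch; results.dta is a hypothetical file name, and column 5 of r(S) is assumed to be the statistic wanted, as in the loop above) is to write each result to a dataset with -postfile-:
Code:
tempname memhold
postfile `memhold' i stat using results.dta, replace
forval i = 1/`n' {
    quietly reg y x`i'
    quietly estat ic
    matrix z = r(S)
    post `memhold' (`i') (z[1,5])
}
postclose `memhold'
use results.dta, clear   // one row per regression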
↧
↧
Collapse (mean) and data ordering
Hi statalist,
I'm working with a data set that resembles the following:
ID Income
1 50
1 40
1 20
2 10
2 40
2 50
3 60
3 20
3 10
I used collapse (mean) Income, by (ID)
now after collapsing the data is appearing in the following form
ID Income
2 Mean(2)
3 Mean(3)
1 Mean (1)
I need the output in the same order as before, i.e, in the form of
ID Income
1 Mean(1)
2 Mean(2)
3 Mean(3)
what should I do to obtain the means in the same order as the original data?
Thanks.
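If ascending ID order is all that is needed, a simple sort ID after the collapse should be enough. To restore the order in which the IDs first appear in the original data, one possibility (a sketch) is to carry a sequence variable through the collapse:
Code:
gen long order = _n
collapse (mean) Income (first) order, by(ID)
sort order
drop order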
↧
How to avoid losing variable labels when using collapse command
I have smallholder agriculture plot-level commercialization data which I want to collapse to household-level data. To avoid losing variable labels I resorted to the foreach-loop method below by Cox:
Copy variable labels before collapse
Code:
foreach v of var * {
    local l`v' : variable label `v'
    if `"`l`v''"' == "" {
        local l`v' "`v'"
    }
}
Attach the saved labels after collapse
Code:
foreach v of var * {
    label var `v' "`l`v''"
}
Whenever I try running this code, I get the error message:
"foreach command may not result from a macro expansion interactively or in do files"
What could be the reason? Thank you in advance
Takesure Tozooneyi
↧
Drawing a line through the outer XY combinations below the trend line
The eventual goal is to assign a minimum value of actl for a new exp value (where actl is missing). I think I first need to get past the issue of drawing a line passing through these observations to use as a decision rule, but I am not really sure how to attack this problem.
Enter
Code:
scatter actl exp
to see the scatter plot. Visually I am trying to get a line passing through the lowest XY points to assign the minimum value of Actl to a new Exp var that does not have a known Actl value yet.
Actl is always discrete between 0 and 1 in multiples of 0.1 and Exp is continuous between 0-1 domain.
Thank you in advance
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float(actl exp) byte minactl . .4306503 . .6 .43129745 . .4 .4318457 . .4 .432377 . .4 .4326184 . .4 .4332968 . .3 .4353066 . .3 .4364009 . .4 .4389387 . . .4391871 . . .4402585 . .4 .4403833 . . .44135585 . . .4416456 . . .4417434 . . .4419167 . . .4438399 . .4 .4453161 . . .4481203 . . .4515771 . .4 .4517835 . . .4522247 . .4 .4527117 . .3 .4529815 . .3 .4544157 . .5 .4677871 . .4 .4767839 . .6 .4869735 . .6 .4926984 . .4 .51891696 . .3 .52825445 . .4 .53248686 . .4 .5328437 . . .5332037 . .6 .5334881 . .6 .5335155 . .5 .5338645 . .5 .53483564 . .6 .53746325 . .5 .5404169 . . .5407301 . . .54236794 . .4 .5438738 . .5 .5484574 . .5 .55163646 . .4 .55418557 . . .5592771 . . .56299645 . .8 .5669411 . . .5672508 . . .5673133 . . .56743836 . . .56973493 . . .5727739 . .8 .57350165 . .6 .57357556 . . .573942 . . .5740153 . . .57531005 . . .5799186 . .6 .5843775 . .5 .58538944 . .6 .58968616 . .5 .59198374 . . .59305274 . .4 .5949966 . .4 .5956155 . .7 .5968927 . .8 .5977873 . .7 .5997788 . . .6584147 . 1 .6827189 . 1 .6881836 . . .6882014 . . .6898127 . 1 .6934553 . . .7054474 . .7 .707083 . . .7075651 . . .7090087 . .9 .7098652 . . .7100893 . . .7105488 . . .7108207 . . .7110068 . . .7116353 . . .7138406 . .6 .7139603 . . .7140106 . . .71864 . . .7186521 . .4 .7188953 . .5 .7188953 . . .7204999 . .7 .720687 . . .7210785 . . .7215325 . .7 .7218328 . . .7218585 . . .728121 . end
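One possible way to get such a lower envelope (a sketch of one decision rule, which may or may not be the rule wanted): bin exp and take the lowest observed actl within each bin, then overlay that as a line on the scatter:
Code:
gen expbin = floor(exp/0.05)*0.05              // 0.05-wide bins of exp
egen minactl_bin = min(actl), by(expbin)       // lowest observed actl in each bin
twoway (scatter actl exp) (line minactl_bin exp, sort)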
↧
Identifying recession points
Hello,
I have a dummy variable taking values 1 and -1: 1 corresponds to a peak and -1 to a trough. I want to replace the points between a 1 and the following -1 (i.e., the recession) with 1, and all the other points with 0. That is, I'm interested only in the points between a peak (1) and the next trough (-1). Here's an example of the data:
I would appreciate any help.
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float(per_q logGDP_point) 120 0 121 0 122 0 123 0 124 0 125 0 126 0 127 0 128 0 129 0 130 -1 131 0 132 0 133 0 134 0 135 0 136 0 137 1 138 0 139 -1 140 0 141 0 142 0 143 0 144 0 145 0 146 1 147 0 148 0 149 0 150 0 151 0 152 0 153 0 154 0 155 0 156 0 157 -1 158 0 159 0 160 0 161 0 162 0 163 0 164 0 165 0 166 0 167 0 168 0 169 0 170 0 171 0 172 0 173 0 174 0 175 0 176 0 177 0 178 0 179 0 180 0 181 0 182 0 183 0 184 0 185 0 186 0 187 0 188 0 189 0 190 0 191 0 192 0 193 0 194 1 195 0 196 0 197 0 198 0 199 0 200 -1 201 0 202 0 203 0 204 0 205 0 206 0 207 0 208 0 209 0 210 0 211 0 212 0 213 0 214 0 215 0 216 0 217 0 218 0 219 0 end
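A sketch of one way to do this, using the variables from the example (whether the peak and trough quarters themselves should be coded 1 or 0 is a choice left open here):
Code:
sort per_q
gen byte recession = .
replace recession = 1 if logGDP_point == 1                    // start at a peak
replace recession = 0 if logGDP_point == -1                   // stop at a trough
replace recession = recession[_n-1] if missing(recession)     // carry the last signal forward
replace recession = 0 if missing(recession)                   // quarters before the first signal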
↧
↧
Command to obtain effect sizes.
Hello,
I am running two models on panel data for 50 firms over 15 years (one using xtnbreg and the other using xtreg). I tried to obtain effect sizes using esize, esizei, and estat esize; however, all of them returned error messages. Could you please help me identify a command that will produce the effect sizes?
Many Thanks.
↧
Regular expression help
Hello,
Apologies for making a second regular expressions-related post today, but the topic is new to me and I've struggled to answer my current problem with the info online.
I have the following string values, say, for variable Branching:
Code:
st12a(5)==1 & st13b == 3
st8a(88) == 1
(e1 == 1 | e1 == 2) & e2==0
and I want to convert the (1- or 2-digit) number in parentheses, which may be followed by " ==" or "==", to two underscores and the number without parentheses:
Code:
st12a__5==1 & st13b == 3
st8a__88 == 1
(e1 == 1 | e1 == 2) & e2==0
(notice the last value is unchanged)
Any help?
Thanks a lot,
Reese
v 14.2
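One possible pattern (a sketch; it assumes that parentheses containing only a 1- or 2-digit number never need to be kept as-is):
Code:
* replace "(5)" with "__5", "(88)" with "__88", etc.; $1 is the captured number
gen Branching2 = ustrregexra(Branching, "\(([0-9]{1,2})\)", "__$1")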
↧
Comparing each observation from one variable to few hundred thousand in another
Hi all,
I am wondering if anyone has any ideas on how I can compare each observation from one variable to every observation in another variable. Essentially, imagine you have 10 observations in one variable A, and then another 1000 in another variable B. For each observation in variable A, I would like to compare it to every value in variable B. In reality, I have over 300,000 observations for each variable, so the computation becomes cumbersome quickly.
I have currently solved the problem in Python, but it takes over 2 minutes to run through 300 observations (or roughly 10 hours for the whole dataset). The algorithm is straightforward enough in Python: fix the first value of variable 1 and compare it to every observation in variable 2, then fix the next value of variable 1 and compare again, and so on. Is there anything a bit more sophisticated in Stata?
I am currently using StataIC 15 on MacOS.
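For the small example, -cross- forms every A-by-B pair directly (a sketch; a.dta and b.dta are hypothetical files holding the two variables). Note that 300,000 x 300,000 pairs will not fit in memory, so at that scale the comparison would have to be done in blocks or with a smarter join:
Code:
use a, clear               // 10 observations of A
cross using b              // forms all 10 x 1,000 combinations
gen byte same = (A == B)   // or whatever comparison is needed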
↧
Calculation of directly standardized rates "distrate" or "dstdize" commands
Hello,
First of all, I don't know if this is the proper site for this question; correct me if it is not.
I need some help:
I had used the "distrate" command on a database, "DATA.dta", before.
I did the cleaning on the database and then modified the "pop.dta" file, which holds the person-years, since the catchment area was modified.
The original code was:
Code:
distrate cases pop using pop.dta, standstrata(age_grp) popstand(pop) by(year sex) format(%8.1f) mult(100000)
and the results were good, expressed this way:
Code:
| year sex cases N crude rateadj lb_gam ub_gam se_gam |
I did it twice, with two different files "pop.dta" and "pop2.dta", since I needed it with both.
The code was saved in the do-file.
Now the issue I am having is that when I try to run the code it says:
Code:
variable pop not found
r(111);
I need help: I tried using the old database and tried making a new pop.dta file for the person-years, and nothing worked.
Thanks for your help.
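A small diagnostic sketch that may help narrow down where the variable pop went missing (given the command above, distrate expects pop both in the data in memory and, via popstand(), in the standard-population file):
Code:
describe using pop.dta       // does the standard-population file still contain pop?
confirm variable cases pop   // does the dataset in memory still contain cases and pop?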
↧
↧
Analyzing length of stay in clustered data (back transform)
Hello,
I'm a relatively new Stata user and am working on a project. I'm looking at length of stay (days), which is heavily right skewed, and analgesic usage (days); the data are clustered within hospitals, so I'm using fixed effects modeling.
Code:
xtmixed log_LOS ty_iv_usage opioid_usage keto_usage age_year sex ib1.race ib2.ethnicity ib0.insurance open perf year ib3.region || hospital_number :, mle variance nostderr
I think this is a fair approach; however, the interpretation is not as intuitive as being able to say that a change in a drug's usage increases LOS by X days. I'm curious about potentially using Duan's smearing to retransform the coefficients as a way to make the interpretation more audience friendly.
One of the issues is the clustering within hospitals, which to me adds a layer of complexity.
Another thought was to leave LOS untransformed and run a median (quantile) regression with clustered bootstrapping.
Code:
bootstrap, cluster(hospital_number) reps(100) seed(5) : qreg length_of_stay post_ty_iv_usage post_opioid_usage post_keto_usage age_year sex ib1.race ib2.ethnicity ib0.insurance open perf year ib3.region, quantile(.5)
This seems like a reasonable approach to being able to interpret the results as: a change in drug use results in a median increase in LOS of X days. However, I have not used the clustered bootstrap command in Stata before and want to verify that it accurately accounts for patients being clustered within hospitals.
Hopefully I've given any readers enough information. Any suggestions/advice is welcome.
Thank you,
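On the Duan's smearing idea, a rough sketch of the smearing factor after the log-scale model above (this uses only the level-1 residuals from the mixed model; whether that is adequate in the presence of the hospital random intercepts is a separate question):
Code:
predict double res, residuals        // level-1 residuals on the log scale
gen double expres = exp(res)
summarize expres, meanonly
display "Duan smearing factor = " r(mean)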
↧
Predicted probabilities with mimrgns after xtgee
Hello,
This question refers to Daniel Klein's excellent mimrgns program. I am attempting to obtain predicted probabilities from a GEE model with a logit link / binomial family. I would use a regular logit, but there is some clustering in the sample. The data is multiply imputed. I have read Daniel's useful helpfile, but continue to get an error. This is the code I am attempting to run.
Code:
mi est, eform: xtgee y i.x1##c.x2##c.x3, family(binomial) link(logit) corr(exch)
mimrgns, dydx(x2) at(x1=(1) x3=(-5(.5)1.5)) predict(pr) post
I get an error that says:
Code:
option pr not allowed
an error occurred when mi estimate executed mimrgns_estimate on m=1
I am able to run this without the "predict(pr)" option and get the linear predictions, as is the default. However, because the model is nonlinear, I'd like to be able to see the predicted probabilities. I would appreciate any advice!
Robbie Dembo
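One thing that may be worth trying (a sketch, not verified against the imputed data): xtgee's predict does not offer a pr statistic, but with family(binomial) and link(logit) the predicted mean, mu, is the probability, so predict(mu) may give what predict(pr) was meant to:
Code:
mimrgns, dydx(x2) at(x1=(1) x3=(-5(.5)1.5)) predict(mu) post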
↧
Help with spmap option polygon
Hi, I'm working on a map of the metro area of Medellín, Colombia. So far so good, but when I try to impose a polygon of the constructed area of the metro area, the maps don't overlay each other.
Here is an example of the two maps separated:
[two map images attached]
Further investigation showed me that the shapefiles are in different projections: the first is in "4170: MAGNA-SIRGAS" and the second is in "EPSG: Datum: D_WGS_1984. Coordinate system: WGS_1984_UTM_Zone_18N. Projection: Transverse_Mercator", so the coordinate databases for the two look very different.
I was wondering if there is a way or method to change the projections of these two shapefiles so I can overlay the second on the first.
Thanks for your help.
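One common route is to reproject one shapefile into the other's coordinate system outside Stata (for example with GDAL's ogr2ogr or in QGIS) before converting it for spmap. A sketch using Stata's shell escape (it assumes GDAL is installed and on the path; the file names are hypothetical and the EPSG code 4170 is simply taken from the post):
Code:
!ogr2ogr -t_srs EPSG:4170 constructed_area_reprojected.shp constructed_area.shp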
↧
Help with scheme entries affecting rarea plots using marginsplot
I am trying to create a scheme that changes the intensity, opacity, or color of a marginsplot when the confidence intervals (CIs) are recast as a rarea plot. In the first plot created by the code below you can see the outline of the CIs recast as an area. In the second I changed the opacity so they are not visible.
Can someone point me to the scheme entries that control the line color or line width of rarea plots so I can make this behavior the default in my marginsplots?
Code:
sysuse auto
regress mpg weight
margins, at(weight=(1800(25)4825))
marginsplot, recast(line) recastci(rarea) ciopts(fcolor(*.5)) name(g1)
marginsplot, recast(line) recastci(rarea) ciopts(fcolor(*.5) lcolor(%0)) name(g2)
Graph G1:
[graph image attached]
Graph G2:
[graph image attached]
Best,
Alan
↧
↧
Calculations of quarterly stock holding data by shareholder
Dear Statalists,
I am dealing with an unbalanced panel dataset for calculating the shareholdings of a particular investing company. As shown in the table below, I would like to calculate the quarterly holding changes by investing company (id=1001) for each portfolio firm (e.g., a, b, c, d, e), and then multiply the difference by the stock price of each portfolio firm in the current quarter.
For example, for portfolio firm a, I first need to calculate the difference in the shares held by investing firm 1001 between the first and second quarters of 1996, which is 1100-1000=100; second, I multiply that difference by the stock price of portfolio firm a in the second quarter of 1996, which gives 100*4.44. This is just for one firm, and the same process should be performed for the other portfolio firms. Similarly, for portfolio firm b, the calculation should be (1100-1200)*4.4...
year | quarter | investment company | portfolio firm | number of shares | stock price of each holding firm |
1996 | 1 | 1001 | a | 1000 | 3.33 |
1996 | 1 | 1001 | b | 1200 | 3.33 |
1996 | 1 | 1001 | c | 1300 | 3.33 |
1996 | 1 | 1001 | d | 1100 | 3.33 |
1996 | 1 | 1001 | e | 1050 | 3.33 |
1996 | 2 | 1001 | a | 1100 | 4.44 |
1996 | 2 | 1001 | b | 1100 | 4.44 |
1996 | 2 | 1001 | c | 1200 | 4.44 |
1996 | 2 | 1001 | d | 1400 | 4.44 |
1996 | 2 | 1001 | e | 0 | 4.44 |
1996 | 3 | 1001 | a | 900 | 5.55 |
1996 | 3 | 1001 | b | 1300 | 5.55 |
1996 | 3 | 1001 | c | 1200 | 5.55 |
1996 | 3 | 1001 | d | 1400 | 5.55 |
1996 | 4 | 1001 | a | 1200 | 6.66 |
1996 | 4 | 1001 | b | 1030 | 6.66 |
1996 | 4 | 1001 | c | 1000 | 6.66 |
1996 | 4 | 1001 | d | 1409 | 6.66 |
1997 | 1 | 1001 | a | 2000 | 7.77 |
1997 | 1 | 1001 | b | 1700 | 7.77 |
1997 | 1 | 1001 | c | 1344 | 7.77 |
1997 | 1 | 1001 | d | 1278 | 7.77 |
1997 | 2 | 1001 | a | 1900 | 8.88 |
1997 | 2 | 1001 | b | 2000 | 8.88 |
1997 | 2 | 1001 | c | 1300 | 8.88 |
1997 | 2 | 1001 | d | 700 | 8.88 |
In general, the calculation could be written as:
(N_{j,i,t} - N_{j,i,t-1}) * P_{j,t}, where N_{j,i,t} is the number of shares held by investing firm i in portfolio firm j in quarter t, and P_{j,t} is the stock price of portfolio firm j in quarter t.
It would be much appreciated if some one can show me the stata codes in dealing with such issue.
Best,
Cong
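A sketch of one way to compute this; the variable names (invco, firm, year, quarter, nshares, price) are assumptions about how the columns in the table are actually named:
Code:
egen pairid = group(invco firm)          // one panel per investing-company / portfolio-firm pair
gen qdate = yq(year, quarter)
format qdate %tq
xtset pairid qdate
gen double trade_value = (nshares - L.nshares) * price   // missing in a pair's first quarter and after gaps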
↧
Categorical dependent variable, how to choose the right model
I would like to regress the likelihood that collateral is pledged on firm variables, loan contract variables, and my test variables.
I have in mind to control for time-variant variation by using time dummies and to control for the heterogeneity of banks by setting them as the panel variable. However, I would like to compare the normal probit model with the panel probit model (which uses random effects by default) and the fixed-effects logit model, so that I can justify my choice by the same comparisons as I did for my other analysis, where I used the loan rate as the dependent variable (there I could compare the fixed- and random-effects models with the OLS regression by using the regress and xtreg commands). Is this approach to justifying my model choice useful? This is my dataset:
input float Collateraldummy int Age long Totalassets byte Numberofemployees float Corporationdummy long Grossprofit double(Profitability Leverage) long Loansize byte(Maturity g1 g2 g3) double Duration byte Housebank str6 Loantype
1 8 1500000 28 1 1600000 .0625 .95 475000 10 0 0 1 0 0 "Credit"
0 8 1500000 28 1 1600000 .0625 .95 475000 10 0 0 1 0 0 "Credit"
1 6 500000 15 1 800000 .0875 .5 150000 10 0 0 1 5.75 1 "Credit"
1 6 500000 15 1 800000 .0875 .5 30000 1 0 0 1 5.75 1 "LC"
1 6 500000 15 1 800000 .0875 .5 20000 1 0 0 1 6 1 "LC"
1 23 387000 10 0 815000 .0343558282208589 .72 80000 1 0 1 0 10 1 "LC"
1 24 415000 10 0 830000 .05060240963855422 .77 80000 1 0 1 0 11 1 "LC"
1 25 400000 10 0 850000 .03529411764705882 .9 120000 1 0 1 0 12 1 "LC"
0 24 415000 10 0 830000 .05060240963855422 .77 60000 6 0 1 0 1 0 "Credit"
1 15 800000 25 1 3500000 .03428571428571429 .2 100000 1 0 0 1 4.666666666666667 0 "LC"
1 15 800000 25 1 3500000 .03428571428571429 .2 620000 20 0 0 1 0 0 "Credit"
1 15 800000 25 1 3500000 .03428571428571429 .2 230000 3 0 0 1 5 0 "LC"
0 7 130000 8 0 300000 .23333333333333334 .4 50000 10 1 0 0 4.75 1 "Credit"
0 1 60000 3 0 190000 0 0 20000 10 1 0 0 0 1 "Credit"
0 7 130000 8 0 300000 .23333333333333334 .4 15000 3 1 0 0 3 0 "LC"
1 20 450000 12 1 800000 .08125 .26 50000 10 0 1 0 10.083333333333334 0 "Credit"
1 18 462000 12 1 830000 .0819277108433735 .32 125000 5 0 1 0 8 0 "Credit"
1 19 438000 12 1 755000 .07549668874172186 .3 100000 5 0 1 0 0 0 "Credit"
1 20 450000 12 1 800000 .08125 .26 15000 1 0 1 0 10 0 "LC"
1 19 438000 12 1 755000 .07549668874172186 .3 15000 1 0 1 0 9 0 "LC"
1 18 462000 12 1 830000 .0819277108433735 .32 15000 1 0 1 0 8 0 "LC"
1 19 438000 12 1 755000 .07549668874172186 .3 120000 1 0 1 0 10 0 "LC"
1 18 462000 12 1 830000 .0819277108433735 .32 120000 1 0 1 0 9 0 "LC"
0 20 450000 12 1 800000 .08125 .26 10000 1 0 1 0 10.583333333333334 0 "LC"
1 15 320000 10 1 1000000 .08 .55 70000 6 1 0 0 7 0 "Credit"
1 15 320000 10 1 1000000 .08 .55 100000 5 1 0 0 5.166666666666667 0 "Credit"
1 10 277000 12 1 800000 .09375 .6 150000 4 1 0 0 5.083333333333333 1 "Credit"
1 18 720000 25 1 1800000 .11388888888888889 .45 350000 3 1 0 0 12 1 "Credit"
0 20 695000 25 1 2000000 .105 .45 300000 6 1 0 0 14 1 "Credit"
1 3 248000 3 1 500000 .11 .44 30000 4 0 1 0 0 0 "Credit"
1 4 250000 3 1 600000 .08333333333333333 .5 50000 5 0 1 0 1.33 0 "Credit"
0 3 248000 3 1 500000 .11 .44 8000 1 0 1 0 0 0 "LC"
0 4 250000 3 1 600000 .08333333333333333 .5 8000 1 0 1 0 1 0 "LC"
0 4 250000 3 1 600000 .08333333333333333 .5 10000 3 0 1 0 1.083 0 "LC"
1 2 462000 25 1 1750000 .022857142857142857 .45 100000 1 0 1 0 0 0 "LC"
1 3 450000 29 1 1900000 .027105263157894736 .5 200000 3 0 1 0 .5833333333333334 0 "LC"
1 3 450000 29 1 1900000 .027105263157894736 .5 100000 1 0 1 0 1 0 "LC"
1 2 462000 25 1 1750000 .022857142857142857 .45 250000 5 0 1 0 0 0 "Credit"
1 4 440000 29 1 2000000 .025 .5 200000 5 0 1 0 1.4166666666666667 0 "Credit"
1 7 360000 9 1 415000 .18795180722891566 .25 15000 1 0 1 0 5 1 "LC"
1 8 350000 9 1 435000 .18620689655172415 .25 25000 1 0 1 0 6 1 "LC"
1 9 345000 9 1 430000 .18604651162790697 .3 15000 1 0 1 0 7 1 "LC"
1 45 1000000 14 0 1450000 .07931034482758621 .6 350000 7 1 0 0 15 1 "Credit"
0 50 1050000 15 0 1500000 .06666666666666667 .7 300000 10 1 0 0 20 1 "Credit"
1 45 1000000 14 0 1450000 .07931034482758621 .6 150000 1 1 0 0 15 1 "LC"
1 46 970000 15 0 1400000 .06785714285714285 .7 150000 1 1 0 0 16.5 1 "LC"
1 47 960000 15 0 1475000 .06779661016949153 .7 150000 1 1 0 0 17.75 1 "LC"
1 7 350000 3 0 400000 .125 .5 20000 1 0 1 0 7 1 "LC"
1 7 350000 3 0 400000 .125 .5 15000 5 0 1 0 7 1 "Credit"
0 25 500000 25 1 1100000 .18181818181818182 .8 150000 10 0 1 0 15 1 "Credit"
0 25 500000 25 1 1100000 .18181818181818182 .8 400000 15 0 1 0 15 1 "Credit"
0 25 500000 25 1 1100000 .18181818181818182 .8 50000 1 0 1 0 15 1 "LC"
0 40 620000 25 0 2000000 .15 .2 150000 10 0 1 0 20 1 "Credit"
0 40 620000 25 0 2000000 .15 .2 50000 1 0 1 0 20 1 "LC"
0 35 380000 12 1 1500000 .06666666666666667 .3 25000 5 0 1 0 15 1 "Credit"
1 4 400000 7 0 950000 .1368421052631579 .25 300000 5 0 1 0 3 1 "Credit"
0 7 425000 9 0 1000000 .123 .2 250000 7 0 1 0 6 1 "Credit"
1 4 400000 7 0 950000 .1368421052631579 .25 50000 1 0 1 0 3 1 "LC"
1 5 415000 8 0 975000 .14358974358974358 .2 80000 1 0 1 0 4.333333333333333 1 "LC"
1 6 410000 9 0 935000 .13368983957219252 .2 80000 1 0 1 0 5.333333333333333 1 "LC"
1 7 425000 9 0 1000000 .123 .2 80000 1 0 1 0 6 1 "LC"
1 102 370000 6 0 427000 .14285714285714285 .42 80000 5 0 1 0 23 1 "Credit"
1 102 370000 6 0 427000 .14285714285714285 .42 30000 1 0 1 0 8 0 "LC"
1 103 375000 6 0 430000 .13953488372093023 .45 45000 1 0 1 0 8.75 0 "LC"
0 102 370000 6 0 427000 .14285714285714285 .42 80000 5 0 1 0 0 0 "Credit"
0 17 3500000 28 1 2875000 .05495652173913043 .38 500000 10 0 0 1 14 1 "Credit"
0 22 3625000 30 1 3000000 .05 .4 400000 7 0 0 1 4 0 "Credit"
1 22 3625000 30 1 3000000 .05 .4 60000 2 0 0 1 5 0 "LC"
1 22 3625000 30 1 3000000 .05 .4 50000 2 0 0 1 .16666666666666666 0 "LC"
0 18 3100000 15 1 2600000 .06538461538461539 .5 150000 3 0 0 1 5 0 "Credit"
0 18 3100000 15 1 2600000 .06538461538461539 .5 130000 4 0 0 1 4 0 "Credit"
0 18 3100000 15 1 2600000 .06538461538461539 .5 50000 2 0 0 1 4 0 "LC"
1 26 2650000 35 1 2300000 .09 .21 300000 5 0 0 1 22 1 "Credit"
1 27 2710000 35 1 2425000 .09278350515463918 .28 250000 7 0 0 1 23 1 "Credit"
0 29 2665000 33 1 2400000 .0875 .25 50000 9 0 0 1 25.25 1 "Credit"
0 30 2700000 33 1 2350000 .08297872340425531 .25 80000 10 0 0 1 26.333333333333332 1 "Credit"
1 27 2710000 34 1 2425000 .09278350515463918 .28 80000 1 0 0 1 23.166666666666668 1 "LC"
1 17 1980000 26 1 1650000 .0893939393939394 .26 325000 10 0 1 0 16 1 "Credit"
0 19 2050000 26 1 1700000 .08941176470588236 .31 150000 8 0 1 0 18.333333333333332 1 "Credit"
0 20 1930000 26 1 1750000 .08857142857142856 .33 220000 5 0 1 0 19.166666666666668 1 "Credit"
0 19 2050000 26 1 1700000 .08941176470588236 .31 80000 1 0 1 0 18.166666666666668 1 "LC"
end
and the Code:
Code:
probit Collateraldummy Age Totalassets Numberofemployees Corporationdummy Grossprofit Profitability Leverage Loansize Maturity g1 g3 Duration Housebank if Loantype!="Credit"
xtprobit Collateraldummy Age Totalassets Numberofemployees Corporationdummy Grossprofit Profitability Leverage Loansize Maturity g1 g3 Duration Housebank if Loantype!="Credit"
for which the iterations do not come to an end. Could you please tell me the reason? I also get an R-squared of 100% that I cannot explain. Also, to use clustered standard errors, what variable do I have to use in the vce() option?
Thanks in advance for your help.
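A sketch of the comparison described, with a hypothetical bank identifier bankid as the panel variable (it is not in the excerpt shown); the cluster variable for vce() would normally be that same panel identifier:
Code:
* time dummies (e.g. i.year) would be added once a year variable is available
xtset bankid
probit Collateraldummy Age Totalassets Numberofemployees Corporationdummy Grossprofit Profitability Leverage Loansize Maturity g1 g3 Duration Housebank, vce(cluster bankid)
xtprobit Collateraldummy Age Totalassets Numberofemployees Corporationdummy Grossprofit Profitability Leverage Loansize Maturity g1 g3 Duration Housebank
xtlogit Collateraldummy Age Totalassets Numberofemployees Corporationdummy Grossprofit Profitability Leverage Loansize Maturity g1 g3 Duration Housebank, fe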
↧
Interpreting panel data coefficient estimates where the variable doesn't change across observations
Hi,
I am looking at the effect of different variables on the recycling rate in England.
I have 311 local authorities in England over 20 quarters and am running a regression including income, population density and household size. I have data on income and population density by quarter for each local authority.
However, I was only able to obtain data on household size by year, and it does not vary by local authority; it is an average for the whole UK (I only have 5 values for household size).
How can I interpret the coefficient on household size?
↧