sample selection and corner solution model

June 23, 2016, 4:26 pm

Hi Statalist, I'm currently working on a log-linear model to explain FDI stock, which takes zero and strictly positive values (the well-known gravity model). Because I have log-linearised my model, I have a sample selection problem that I can address with a Heckman selection model. Moreover, I have a high share of zeros in the stock level (around 60%). After discussing with my supervisor, I don't think my zeros are "true zeros", so I have to use double hurdle models.

However, I don't think double hurdle models work well with a log-linearised dependent variable. Has anyone already had the same problem?

↧

cmp - Conditional mixed process estimator with multilevel random effects and coefficients

June 23, 2016, 5:48 pm

Hello all,

I am running the following command:

cmp (hort=area_cult_hh) (techno=area_cult_hh) (lq_final_trim=area_cult_hh) (laborarea=area_cult_hh), ind(4 5 1 1) qui

To get predicted values and marginal effects, I can run predict or margins after the cmp command. The key thing that I want to do is to get marginal effects from equations 2, 3, and 4 that are *conditional* on hort=1 (i.e., on the field having hort on it; these are field-level regressions).

Thanks,
Benedito

↧

Formula for cluster VCE in areg

June 23, 2016, 6:10 pm

From the Stata Manual for Areg:

The number of groups specified in absorb() are included in the degrees of freedom used in the finite-sample adjustment of the cluster–robust VCE estimator. This statement is only valid if the number of groups is small relative to the sample size. (Technically, the number of groups must remain fixed as the sample size grows.) For an estimator that allows the number of groups to grow with the sample size, see the xtreg, fe command in [XT] xtreg.

What is the actual formula used to make the finite-sample adjustment of the areg cluster-robust VCE?

Thanks,
Carl

↧

conformability error r(503) when using CQIV code for censored quantile instrumental variable regression

June 23, 2016, 6:16 pm

Hi -

I am using the CQIV code for censored quantile instrumental variable regression. Here is the command I am using:

cqiv consumption price characteristics, confidence(boot) bootreps(200) exogenous quantiles(40(1) 95)

The above works fine and produces results, but when I increase the number of bootreps to, say, 500, 1000, or 2000, I get the following error message in the results window:

conformability error r(503)

I've tried some of the suggestions in related posts, but I still get the error message. Thanks for any advice.

↧

OLS with Conley Spatial standard errors

June 23, 2016, 6:49 pm

Hi everyone,

I desperately need some help here.
I am using x_ols to get standard errors corrected for spatial dependence, which works fine.
But I also need the adjusted R-squared, which is not part of the output. In "ereturn list", e(r2_a) gives a very large number that is far from the true one (I know this because this is a replication of a famous published paper).
What about the Wald test? I suspect that the "test" command after x_ols doesn't work properly, because the result is no different from the test after ordinary OLS.

Any advice?

↧

Data managing

June 23, 2016, 9:09 pm

Hi,
I have panel data. My dataset isn't complete, as shown in fig3.
I want to make it look like fig4.
Please give me some advice.
Thanks a lot.

[fig3 and fig4: images not available]

↧

melogit "cannot compute an improvement -- discontinuous region encountered" STATA 14.1

June 26, 2016, 7:58 am

Hello everybody,

I checked for updates this morning and updated my Stata 14 to Stata 14.1. Tonight my melogit no longer runs: "cannot compute an improvement -- discontinuous region encountered".
I didn't change anything in my code, and the update is my only explanation.

I read on another forum that I am not the only one to have this problem with the Stata update.

I checked my model building, and the problem already starts with this univariable analysis:
melogit Sum_Mo i.CatQ || Num: || TVD:, binomial(Q_Mo) or

I ran this model in May and the overall p-value of CatQ was 0.0002, and now it doesn't run anymore.
Any advice, please?

↧

Number of observations

June 26, 2016, 8:28 am

Hello everyone,

I have an unbalanced panel data set of 65,000 observations on 10,000 firms over 1975-2000, so some firms have 1-2 years of data while others have 20 years. But when I run xtreg with firm and year fixed effects, the number of observations reported is only 3300, with 1000 groups. Can someone explain why the number of observations is so much smaller?

Thank you very much for your help.
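
One way to see where observations go is to flag the estimation sample and count missing values in the model variables, since an observation is dropped when any variable in the model is missing. The sketch below runs on made-up data; the names y, x1, x2, firm, and year are hypothetical and not taken from the post.

Code:

clear
set seed 12345
set obs 200
* 20 hypothetical firms with 10 years each
generate firm = ceil(_n/10)
bysort firm: generate year = 1990 + _n
generate x1 = rnormal()
generate x2 = rnormal()
* Assumption for illustration: x2 is missing in roughly 40% of rows
replace x2 = . if runiform() < 0.4
generate y = 1 + x1 + x2 + rnormal()

xtset firm year
xtreg y x1 x2 i.year, fe

* Flag the observations the estimator actually used
generate byte used = e(sample)
count if used

* Count missing values in each model variable
misstable summarize y x1 x2

Comparing the count of used observations with the missing-value summary usually shows which variable is responsible for most of the loss.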

↧

Performing Wald test with interaction term

June 26, 2016, 8:32 am

I want to perform a Wald test using a regression that looks like this one from a paper of Toolsema and Jacobs (2007):

[equation image not available]

My problem is that I do not exactly know how to format the regression in such a way that Stata can perform the Wald test. What I am doing is using a dummy to separate the positive from the negative values. However, in that case you get output that looks something like this:

[regression output image not available]

With this output it does not seem possible to perform the Wald test to compare the cases where the dummy is 0 or 1. Does anyone have an idea how to set up the regression so that the Wald test can be performed (testing that + (a4) is different from - (a5))?

Thank you in advance,

Danny

↧

Difference between Invariant variables.. Fixed Effect

June 26, 2016, 8:34 am

Hi Everyone,

One more question for the day. If I want to estimate whether the effect of height on weight differs significantly between males and females, the usual model would be:
xi: regress weight i.female*height
The statistical significance of the coefficients on the interaction term and on the dummy for gender will tell me whether height affects weight more or less for females than for males. Here is the complete example: http://www.ats.ucla.edu/stat/stata/faq/compreg2.htm

My question is: how can I do this with panel data and fixed effects? The gender variable is the same for each id in every year (xtset id year), so it will be dropped. Any suggestions? I know that random effects would work in that case, but the Hausman test strongly suggests fixed effects. The only thing I can think of is running a regression for each subgroup:
1) xtreg weight height if gender==1, fe
2) xtreg weight height if gender==0, fe
This won't test whether there is a statistical difference between the groups, but it will show the effect of height for each group.

Thank you
Marco

↧

Sort command and Correlation table

June 26, 2016, 9:20 am

Dear Statalist,

I have panel data. I have sorted the data by religious denomination (F025), and for each denomination I have created a correlation table using as input variables participation in religious services (F028) and frequency of prayer (F067).

Code:

sort F025
by F025: pwcorr F028 F067 if EU28==1

These are two possible outputs (there are 10 tables like this, one for each religious denomination): [output image not available]

The problem is that I would like to obtain a single table with all the correlation coefficients (and their significance).

Is there a command that I can use in order to obtain that?

I apologise if this question might be a little bit elementary.

Thanks for the attention.
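
One possible way to get a single table, sketched here on made-up data: statsby can collect the results saved by correlate into one row per denomination, and the p-value can then be computed from the usual t statistic for a Pearson correlation. The variable names follow the post, but the data and the three-group setup below are assumptions for illustration only.

Code:

clear
set seed 2016
set obs 300
* Three hypothetical denominations with 100 respondents each
generate F025 = ceil(_n/100)
generate F028 = rnormal()
generate F067 = 0.5*F028 + rnormal()
generate byte EU28 = 1

keep if EU28 == 1
* One row per denomination: correlation and number of observations
statsby rho=r(rho) n=r(N), by(F025) clear: correlate F028 F067

* Two-sided p-value from the t statistic of a Pearson correlation
generate t = rho*sqrt((n - 2)/(1 - rho^2))
generate p = 2*ttail(n - 2, abs(t))
list F025 rho n p, noobs

With only two variables, correlate and pwcorr use the same observations, so the coefficients should match the per-group pwcorr tables. Note that statsby with the clear option replaces the data in memory, so it is safer to run this on a copy (for example inside preserve/restore).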

↧

Combining two variables into one

June 26, 2016, 9:39 am

I have 2 variables identifying whether someone has heard/seen a health message. I want to combine them into 1 variable. In the first variable (s339), 2853 people said Yes, they heard/saw the health message. In the second variable (s342), 577 people said Yes. So the total in the new variable should be 3430.
Yet, when I use
egen HIVinfo = rowmax(s339 s342)
I get 2973 "Yes" responses not 3430 "Yes" responses

The same when I use the generate and replace commands.
Attached is the dataset and do file

Best
M
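
A possible explanation, assuming both variables are coded 1 = Yes and 0 = No: rowmax() returns 1 when either variable is 1, so someone who answered Yes to both questions is counted once, not twice. Under that assumption the figures in the post imply 2853 + 577 - 2973 = 457 people who said Yes to both. The sketch below shows the same pattern on ten made-up observations.

Code:

clear
set obs 10
* Hypothetical coding: 1 = Yes, 0 = No
* Observations 1-6 say Yes to the first question
generate byte s339 = _n <= 6
* Observations 5-8 say Yes to the second question (5 and 6 overlap)
generate byte s342 = inrange(_n, 5, 8)
egen byte HIVinfo = rowmax(s339 s342)

* Said Yes to both questions
count if s339 == 1 & s342 == 1
* Said Yes to at least one question
count if HIVinfo == 1
tabulate s339 s342, missing

Here 6 + 4 Yes answers give 8 people with at least one Yes, because 2 people appear in both groups. Running the cross-tabulation on the real data would show whether the overlap is in fact 457.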

↧

Multilevel modeling with survey data - question

June 26, 2016, 11:43 am

Hello, I am a political science graduate student, new to the Stata Forum and a first time poster. I am working on a project where I want to run a multilevel model using weighted survey data, but I am running into some issues with which I would greatly appreciate some help.

The data I am using is a national survey with final sample weights (weight) available that take into account the survey’s design effect. For each respondent, I created an “ideological distance” variable (ideo), which is how far away ideologically the respondent is from their member of Congress. For example, if a respondent is very liberal, and her Congressperson is very conservative, ideo would be a very big number. Ideo would be a very small number if they are both very liberal. The model I want to run would, in addition to having other variables, assess the impact of ideo on the likelihood that the respondent approves of their Congressperson (approve – a binary variable). Further, I want to look at only subsets of respondents – specifically, looking at one model including only Democrats, and one model including only Republicans.

The data is hierarchical: multiple respondents often come from the same Congressional district (district) and have the same member of Congress, so their values of the ideo variable are not completely independent. Because of this, I’d like to use a multilevel mixed effects model. However, when I run a survey-weighted model that tries to incorporate this hierarchy, this is the output I get:

svyset [pweight=weight]
svy, subpop(democrats): melogit approve ideo || :
(running melogit on estimation sample)
survey final weights not allowed with multilevel models;
a final weight variable was svyset using the [pw=exp] syntax, but multilevel models require that each stage-level
weight variable is svyset using the stage's corresponding weight() option

I can't provide a district-level weight, though, because a) the survey did not do any stage of sampling at the district level, and b) all I have is final weights. One solution I thought about was to create a weight variable equal to one for Congressional districts, and incorporate that into the svyset command, as such:

gen x = 1
svyset district, weight(x) || _n, weight(weight)
svy, subpop(democrat): melogit approve ideo || district:

Here, I get output that makes some sense, but does running a model like this incorporate the final sample weights correctly? To me, it logically seems like it would, but it's not an assumption I would want to make without asking people smarter than me.

Is this a correct way to account for district-level effects while still applying survey weights correctly? If not, how could I account for the hierarchical nature of the data while still applying these final, respondent-level weights correctly? Any advice or thoughts would be helpful; thanks!

- Ryan Strickler

↧

Maps

June 26, 2016, 12:27 pm

I'm trying the maps commands, but when using "shp2dta" I always get an error message despite having followed the instructions. I attach a file with more details. Can anybody help me?

↧

Replacing observations with missing values

June 26, 2016, 1:32 pm

Dear all,

Suppose I have the following data:

Code:

id          var5     var1   var3
0             5        5      5
50           10       10     10
55           20       20     20
78           30       30     30
98           40       40     40
143          50       50     50
167          60       60     60
184          70       70     70
192          80       80     80
203          90       90     90
219          100     100    100
238          110     110    110
246          120     120    120
252          130     130    130
269          140     140    140
280          150     150    150

I am trying to create a loop that will replace certain observations in var5, var1, and var3 (in that order) with missing values based on values of id (0, 167, 246). However, so far I have been unsuccessful. The data should look like this:

Code:

id               var5       var1       var3
0                 5           .           .  
50               10           .           .
55               20           .           .
78               30           .           .
98               40           .           .
143              50           .           .
167              .           60           .
184              .           70           .  
192              .           80           .
203              .           90           .  
219              .          100           .  
238              .          110           .
246              .             .        120
252              .             .        130
269              .             .        140
280              .             .        150
290              .             .        160
305              .             .        170

Any ideas how I might be able to achieve that?

Thank you.
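
A minimal sketch of one way to do this, assuming the rule is that each variable is kept only from its own starting id up to (but not including) the next starting id. The macro names vars, start, and stop are hypothetical, and the data are the 16 rows of the first listing.

Code:

clear
input id var5 var1 var3
  0   5   5   5
 50  10  10  10
 55  20  20  20
 78  30  30  30
 98  40  40  40
143  50  50  50
167  60  60  60
184  70  70  70
192  80  80  80
203  90  90  90
219 100 100 100
238 110 110 110
246 120 120 120
252 130 130 130
269 140 140 140
280 150 150 150
end

* Variables in the order they should be kept
local vars  var5 var1 var3
* First id at which each variable is kept
local start 0 167 246
* First id at which each variable is blanked again (. means never)
local stop  167 246 .

forvalues k = 1/3 {
    local v  : word `k' of `vars'
    local lo : word `k' of `start'
    local hi : word `k' of `stop'
    * Blank everything outside the half-open interval [lo, hi)
    replace `v' = . if id < `lo' | id >= `hi'
}
list, noobs sep(0)

For the last variable the upper bound is missing, and id >= . is false for every nonmissing id, so var3 is kept through the end of the data.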

↧

ARDL long-run coefficient

June 26, 2016, 2:39 pm

Dear all,

I have a question regarding the calculation of long-run coefficient from an ARDL model.

I would like to estimate the following ARDL model and then test whether y_t and x_1t are cointegrated by means of the bounds testing approach:

Δy_t = β₀ + Σ β_i Δy_{t-i} + Σ γ_j Δx_{1,t-j} + θ₀ y_{t-1} + θ₁ x_{1,t-1} + e_t

Code:

* the constant is included by default, so it is not listed
regress d.y  l.d.y  l(0/1).d.x1  l.y  l.x1

From this equation, I know that the long-run coefficient for x₁ is -(θ₁/θ₀).
(The bounds test says that the two variables of interest are cointegrated.)

I would like to know whether this long-run coefficient is equivalent to the long-run coefficient in the corresponding long-run relationship between y_t and x_1t (the levels model):

y_t = β₀ + a₁ x_{1,t} + v_t (levels model)

Code:

regress y x1

When I performed the estimation, the two different calculations of the long-run coefficient did not give the same result. I am confused and wonder where I could have made a mistake. Or must a₁ not necessarily be equal to -(θ₁/θ₀)?

Thank you so much for your help

Here are the Stata results:

      Source |       SS       df       MS              Number of obs =      36
-------------+------------------------------           F(  5,    30) =   16.94
       Model |  63.2903203     5  12.6580641           Prob > F      =  0.0000
    Residual |  22.4119664    30  .747065547           R-squared     =  0.7385
-------------+------------------------------           Adj R-squared =  0.6949
       Total |  85.7022867    35  2.44863676           Root MSE      =  .86433

------------------------------------------------------------------------------
          D. |
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           y |
         LD. |   .2882683   .1202836     2.40   0.023     .0426164    .5339202
          x1 |
         D1. |  -.4332363    .065969    -6.57   0.000     -.567963   -.2985096
         LD. |   .1881575   .0844541     2.23   0.034     .0156793    .3606358
           y |
         L1. |  -.9508915   .1854258    -5.13   0.000    -1.329582   -.5722014
          x1 |
         L1. |  -.0168967   .0079469    -2.13   0.042    -.0331265   -.0006669
       _cons |   2.646874   .6757397     3.92   0.000     1.266829    4.026918
------------------------------------------------------------------------------

      Source |       SS       df       MS              Number of obs =      38
-------------+------------------------------           F(  1,    36) =    8.80
       Model |  15.5955962     1  15.5955962           Prob > F      =  0.0053
    Residual |  63.8147884    36  1.77263301           R-squared     =  0.1964
-------------+------------------------------           Adj R-squared =  0.1741
       Total |  79.4103846    37  2.14622661           Root MSE      =  1.3314

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   -.029125   .0098192    -2.97   0.005    -.0490393   -.0092108
       _cons |   2.919511   .5483467     5.32   0.000     1.807412     4.03161
------------------------------------------------------------------------------

We can see that -(-.0168967/-.9508915), which is about -.0178, is not equal to -.029125.
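
For reference, the ratio can be computed directly after the regression with nlcom, which also reports a delta-method standard error for it. The sketch below uses made-up series (t, x1, and y are generated only so the commands run) and is not a reproduction of the results above. The two approaches are different estimators of the same long-run parameter, fitted here on different samples (36 versus 38 observations), so they need not give the same number.

Code:

clear
set seed 42
set obs 40
generate t = _n
tsset t
* Hypothetical series for illustration only
generate x1 = 50 + sum(rnormal())
generate y  = 3 - 0.02*x1 + rnormal()

regress D.y LD.y D.x1 LD.x1 L.y L.x1

* Long-run coefficient -(theta1/theta0) with a delta-method standard error
nlcom (longrun: -_b[L.x1]/_b[L.y])

* The same arithmetic with the coefficients reported in the post
display -(-.0168967/-.9508915)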

↧

Error while importing multiple sheets

June 26, 2016, 3:52 pm

Hi all,

I am importing 8 sheets into Stata 13 from a single Excel file using a loop, and then saving them as data sets. The loop works perfectly for the first three sheets; however, when it tries to import the fourth sheet, Stata shows the error given below:

_xlshreadstringcolforce(): 3900 out of memory
import_excel_read_strcol(): - function returned error
import_excel_load_file(): - function returned error
import_excel_import_file(): - function returned error
<istmt>: - function returned error
r(3900);

Could anyone guide how to resolve this issue?

Note: I am facing this issue in Stata 13, which I have just installed on my machine. Previously, I was using Stata 12, and my current do-file worked perfectly fine in it.

Thanks,
Abbas

↧

Marginal Effect and Fixed Effect with Interaction Dummies

June 26, 2016, 3:53 pm

Hi everyone,

This is a follow-up to a previous post (http://www.statalist.org/forums/foru...s-fixed-effect), but the question is changing and I feel it deserves its own discussion.

Here is my issue. I run a fixed effects model with an interaction term (xtreg weight i.female##c.height, fe) in order to see whether there are significant differences between males and females in the effect of height on weight. I obtain the following results: the dummy for female is dropped (as expected), the coefficient on height is -1.088, and the coefficient on the interaction term is -1.010. All are significant.

After this I compute marginal effects to look at females and males separately, using the following command: "margins female, dydx(height) noestimcheck". The results are the following:

female =1 the coefficient is -0.77 and it is NOT statistically significant
female =0 the coefficient is -1.088 and it is statistically significant

My question is: if the model xtreg weight i.female##c.height, fe suggests that there is statistically significant evidence that being female is associated, on average, with 1.088 units less increase in weight for the same additional unit of height than for males, why is my coefficient for female=1 not significant?

How do I interpret these results? What can I say about females? Their effect of height on weight is statistically significantly different from that of males, but the effect of their height on weight is not significant?! It is a little bit confusing.

Thank you

Marco
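
It may help to write out the two quantities being tested. The interaction coefficient tests whether the female slope differs from the male slope, while the margins line for female=1 tests whether the female slope (the height coefficient plus the interaction coefficient) differs from zero. These are different hypotheses, so one can be significant without the other. The sketch below shows the second quantity with lincom; the data are made up and only the variable names follow the post.

Code:

clear
set seed 7
set obs 400
* 50 hypothetical individuals observed for 8 years each
generate id = ceil(_n/8)
bysort id: generate year = _n
generate byte female = mod(id, 2)
generate height = 150 + 5*rnormal() + year
generate weight = 20 + 0.5*height - 0.3*female*height + rnormal()

xtset id year
xtreg weight i.female##c.height, fe

* Slope of height for males (female = 0)
lincom height
* Slope of height for females: base slope plus interaction
lincom height + 1.female#c.height

The second lincom line reports the sum of the two coefficients with its own standard error, which depends on the covariance between them and not only on their individual standard errors.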

↧

Can I use xttest3 for N=33 and t=8 (panel data)?

June 26, 2016, 5:12 pm

Hello
I read a note saying that the power of the modified Wald statistic is very low in the context of fixed effects with "large N, small T" panels.

Is it appropriate to use xttest3 with my data (N=33 and T=8)?
The result is
chi2 (33) = 432.18
Prob>chi2 = 0.0000
(Heteroskedasticity)

(Note: I also used an LM test for heteroskedasticity, and it indicated homoskedasticity.)

Thank you very much

↧