Channel: Statalist

Impose sign constraint in SUR model

Hi

I am trying to impose a sign constraint on a parameter in a SUR model estimated with the sureg command, but I do not know how to do it.

Thanks.



How to generate a histogram with 3 variables in it?

Hi,

I am trying to find a way to get three bars in one histogram. I have three numeric variables and would like to present the frequency of each side by side on the x-axis of the histogram. I checked the manual but didn't find a way to accomplish this.

I would very much appreciate your help.

Regards,
Mengmeng

Interpreting margins to understand interactions

Hi!
I have a longitudinal linear mixed model with the term:
Code:
c.age##c.age##c.age##c.cov1_between
and some other covariates. Cov1 is a time-varying covariate.
I also run the model a second time with the within component of cov1 in the interaction term instead of the between component. What I would like to ask is how I should use margins to understand the results of these interactions. I thought that, to interpret the between term (each person's average level of cov1), I should use:
Code:
margins, at(age=(0 (12) 96) cov1_between=(30 45 60)), atmeans
This will show how persons with a cov1_between of a certain level change values of dependent variable with age.
However, for the within-term, the question is whether a change in cov1 level is associated with a corresponding change in the dependent variable. To understand this, should I use:
Code:
margins, dydx(cov1_within) at(age=(0 (12) 96)), atmeans
Any advice is greatly appreciated!

Best,
Kjell Weyde


Merging Annual data with monthly data

Hi everyone
I am facing a problem where I have two files. File A has annual data of the following format

Code:
clear
input byte id int(year aseets)
1 2015  8000
1 2016  9000
2 2015 12000
2 2016 13000
end
While file B has monthly data of the following format.
Code:
clear
input byte id int year str3 month byte mid float return
1 2015 "JAN"  1  .05782197
1 2015 "FEB"  2  .05111383
1 2015 "MAR"  3    .087772
1 2015 "APR"  4  .09744865
1 2015 "MAY"  5 .024168964
1 2015 "JUN"  6 .004063002
1 2015 "JUL"  7 .009543985
1 2015 "AUG"  8  .05168914
1 2015 "SEP"  9  .04424915
1 2015 "OCT" 10  .09077155
1 2015 "NOV" 11 .011542545
1 2015 "DEC" 12  .07619212
1 2016 "JAN"  1   .0731537
1 2016 "FEB"  2 .032152444
1 2016 "MAR"  3 .021415345
1 2016 "APR"  4  .04092654
1 2016 "MAY"  5  .07349947
1 2016 "JUN"  6 .005759252
1 2016 "JUL"  7  .08644056
1 2016 "AUG"  8  .04981745
1 2016 "SEP"  9  .07654408
1 2016 "OCT" 10  .06048589
1 2016 "NOV" 11  .05243156
1 2016 "DEC" 12 .021681713
1 2017 "JAN"  1   .0225843
1 2017 "FEB"  2  .07896894
1 2017 "MAR"  3 .037220813
1 2017 "APR"  4 .031387977
1 2017 "MAY"  5  .04828653
1 2017 "JUN"  6  .05450794
1 2017 "JUL"  7 .000512381
1 2017 "AUG"  8  .05626509
1 2017 "SEP"  9    .030741
1 2017 "OCT" 10  .01095728
1 2017 "NOV" 11  .07522747
1 2017 "DEC" 12 .033911105
2 2015 "JAN"  1  .05782197
2 2015 "FEB"  2  .05111383
2 2015 "MAR"  3    .087772
2 2015 "APR"  4  .09744865
2 2015 "MAY"  5 .024168964
2 2015 "JUN"  6 .004063002
2 2015 "JUL"  7 .009543985
2 2015 "AUG"  8  .05168914
2 2015 "SEP"  9  .04424915
2 2015 "OCT" 10  .09077155
2 2015 "NOV" 11 .011542545
2 2015 "DEC" 12  .07619212
2 2016 "JAN"  1   .0731537
2 2016 "FEB"  2 .032152444
2 2016 "MAR"  3 .021415345
2 2016 "APR"  4  .04092654
2 2016 "MAY"  5  .07349947
2 2016 "JUN"  6 .005759252
2 2016 "JUL"  7  .08644056
2 2016 "AUG"  8  .04981745
2 2016 "SEP"  9  .07654408
2 2016 "OCT" 10  .06048589
2 2016 "NOV" 11  .05243156
2 2016 "DEC" 12 .021681713
2 2017 "JAN"  1   .0225843
2 2017 "FEB"  2  .07896894
2 2017 "MAR"  3 .037220813
2 2017 "APR"  4 .031387977
2 2017 "MAY"  5  .04828653
2 2017 "JUN"  6  .05450794
2 2017 "JUL"  7 .000512381
2 2017 "AUG"  8  .05626509
2 2017 "SEP"  9    .030741
2 2017 "OCT" 10  .01095728
2 2017 "NOV" 11  .07522747
2 2017 "DEC" 12 .033911105
end
I want to match the data in file A for calendar year t - 1 (e.g., 2015) with the returns data in file B for July of year t (say, 2016) through June of year t + 1 (say, 2017). I would really appreciate it if someone could help me with this problem.
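
One possible approach, offered as a sketch rather than a definitive answer: build a key in the monthly file that says which annual year each month should pick up, then do a many-to-one merge. The key name ayear is hypothetical, and only a few monthly rows are re-entered here to keep the example short. It assumes mid is the month number, as in the posted data.

```stata
* File A: annual data (variable name "aseets" kept exactly as posted)
clear
input byte id int(year aseets)
1 2015  8000
1 2016  9000
2 2015 12000
2 2016 13000
end
* Hypothetical key: the calendar year whose annual data should attach
rename year ayear
tempfile annual
save `annual'

* File B: a few monthly rows, enough to show the rule
clear
input byte id int year byte mid float return
1 2016  6 .005759252
1 2016  7  .08644056
1 2017  6  .05450794
1 2017  7 .000512381
end

* July-December of year t take annual data from t - 1.
* January-June belong to the window that started the previous July,
* so they take annual data from two calendar years back.
gen int ayear = cond(mid >= 7, year - 1, year - 2)

* Keep all monthly rows; unmatched ones simply get missing aseets
merge m:1 id ayear using `annual', keep(master match)
sort id year mid
list id year mid ayear aseets _merge
```

With this rule, July 2016 through June 2017 all carry the 2015 value, while months before July 2016 have no annual match in the posted file A and are left missing.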

Looping across dta files to extract a series

I have several dta files with the same indicators. I need to extract the same indicator from each of the dta files. I was thinking of using a loop, but I am not sure how to proceed.
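
A minimal sketch of such a loop, under assumptions: the folder name demo_files and the variable names id and gdp are hypothetical stand-ins, and the first part only creates two small files so that the example runs on its own. The idea is to list the files with the extended macro function dir, load only the wanted variables with use ... using, and stack the pieces with append.

```stata
* Build two tiny demo files so the sketch is self-contained.
* Folder "demo_files" and variables "id", "gdp", "other" are hypothetical.
capture mkdir demo_files
forvalues k = 1/2 {
    clear
    set obs 3
    gen id = _n
    gen gdp = 100*`k' + _n
    gen other = 0
    save "demo_files/file`k'.dta", replace
}

* List every .dta file in the folder
local files : dir "demo_files" files "*.dta"

* Start from an empty dataset and append one file at a time
clear
tempfile combined
save `combined', emptyok
foreach f of local files {
    * Load only the variables of interest from this file
    use id gdp using "demo_files/`f'", clear
    * Record which file each row came from
    gen source = "`f'"
    append using `combined'
    save `combined', replace
}
list
```

Keeping a source variable makes it possible to tell the series apart after stacking; if the files should instead sit side by side, a merge on the identifier inside the loop would be the analogous step.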

Panel data or not?

Hi everybody,

I have an unbalanced panel (?) dataset of the following structure:
Code:
           i      n    iso    year        rec  share |
     |---------------------------------------------|
624. |    19     31   SWE   1982          0   23.6 |
625. |    19     32   SWE   1985          0   21.3 |
626. |    19     33   SWE   1988          0   18.3 |
627. |    19     34   SWE   1991          1   21.9 |
628. |    19     35   SWE   1994          0   22.4 |
     |---------------------------------------------|
629. |    19     36   SWE   1998          0   22.9 |
630. |    19     37   SWE   2002          0   15.3 |
631. |    19     38   SWE   2006          0   26.2 |
632. |    19     39   SWE   2010          0   30.1 |
633. |    19     40   SWE   2014          0   23.3 |
     +---------------------------------------------+
634. |    20      1   USA   1870          .   50.3 |
635. |    20      2   USA   1872          0   52.7 |
636. |    20      3   USA   1874          1   45.5 |
637. |    20      4   USA   1876          1   47.4 |
638. |    20      5   USA   1878          0   40.7 |
     |---------------------------------------------|
639. |    20      6   USA   1880          0   46.6 |
I want to run a fixed-effects panel regression (xtreg) with time effects -- country-specific trends (i.time) and/or common year effects (i.year) -- of the variable "share" on the recession dummy "rec".

The problem: "year" refers to election years ("share" is the outcome for a particular party in that election). Naturally, there are no elections in some years (Sweden votes every third year, the USA every other year, and so on).

Do I have to expand the data and linearly interpolate and then do panel regression? Or is this already correct "panel data"?

The problem is that the spacing between observations (vote shares in elections) is not the same across countries. As an extreme example, i=37 for Sweden refers to the election in 2002, while i=37 for the USA refers to the election in 1956.

Isn't this a necessary condition for panel data? In addition, the data is unbalanced. What is the correct procedure here when I want to estimate a regression that controls for country and time effects? Or can I only use the pooled data and do simple OLS?

Thank you very much in advance,
L.

Determining optimal lag length using "varsoc" for panel data

I have time series data for 100 countries from 1990 to 2000.
I want to determine the optimal lag length for the explanatory variables. I know varsoc does this for time series,
but it gives me an error:

repeated time values in sample

I think it's because the data are a panel, so the time variable has repeated values.

How can I solve this issue?

Thanks.

Treatment Effects and Propensity Matching (postestimation)

Hi,

Regarding the "teffects psmatch" command and related post-estimation matching results:

My understanding is that "tebalance summarize" is meant to serve a similar role as the prior "pstest". I also know of the tebalance box & tebalance plot commands.

However, is there a way to get odds ratios or to do a test on the balance of the covariates? For instance, "tebalance overid" works with other teffects commands, but NOT teffects psmatch.

Please advise.


Thank you,
Laxmi

using reghdfe for partial regression, inconsistent results

/*Hi Statalisters - I had a question using the command reghdfe for partial regression. I am getting inconsistent results with reghdfe and
am not sure what is going on. Would appreciate any help.
See below for code. */

*
* PARTIAL REGRESSION USING REGHDFE, WANT TO ESTIMATE EQUATION 1 BUT ``CAN'T'' INCLUDE VARIABLE GRADE
*
u "http://www.stata-press.com/data/r9/union.dta", clear
* (1) want to run the following regression
reghdfe union grade age south t0, abs(idcode i.idcode#c.year)
* but suppose that I cannot include the variable grade in the model. No worries, I can get residuals
* for the dependent variable union, regressed on included variables
keep if e(sample)
qui reghdfe union age south t0, abs(idcode i.idcode#c.year) res(unionres)
*and I can get residuals for grade, using again the included variables.
qui reghdfe grade age south t0, abs(idcode i.idcode#c.year) res(graderes)
* regressing union residuals on grade residuals produces the same coefficient as the full model above
reg unionres graderes

*
* REPEAT EXERCISE, REPLACING GRADE FOR t0
*
* here I repeat the exercise, swapping out the excluded variable grade for t0. same thing
u "http://www.stata-press.com/data/r9/union.dta", clear
* (1) want to run regression but can't include t0
reghdfe union grade age south t0, abs(idcode i.idcode#c.year)
keep if e(sample)
*get residuals for depvar
qui reghdfe union age south grade, abs(idcode i.idcode#c.year) res(unionres)
*get residuals for omitted var
qui reghdfe t0 age south grade, abs(idcode i.idcode#c.year) res(t0res)
* regress union residuals on t0 residuals and ... not the same,
reg unionres t0res

* thanks in advance for any help.

New on SSC: -aextlogit- Average elasticities for fixed effects logit

With the usual thanks to Kit Baum, aextlogit is now available on SSC.

aextlogit is a wrapper for xtlogit which estimates the fixed effects logit and reports estimates of the average (semi-) elasticities of Pr(y=1|x,u) with respect to the regressors, and the corresponding standard errors and t-statistics. The method used to compute the (semi-) elasticities was first described by Kitazawa (2012); see Kemp and Santos Silva (2016) for further details.

Please do let me know if you have problems with the files.

Best wishes,

Joao

Stata Out of Sample Forecasting

I am not getting out-of-sample forecasts from the predict command.
My steps: 1) set the date format; 2) tsappend, add(12); 3) run the regression; 4) predict, say, yhat. The forecasts stay in sample and do not extend to the 12 appended future dates.

Am I doing something wrong? Any help will be greatly appreciated.

Regards

Adriaan
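
One common cause, which may or may not be what is happening here: predict can only fill the appended periods if every regressor is non-missing in those rows. The sketch below uses simulated data (all variable names are hypothetical) to contrast a trend-only model, where the appended rows get predictions, with a model that includes a regressor x that has no future values.

```stata
* Simulated monthly series; names t, x, y are hypothetical
clear
set seed 12345
set obs 24
gen t = tm(2015m1) + _n - 1
format t %tm
tsset t
gen x = rnormal()
gen y = 10 + 0.5*t + x + rnormal()

* Add 12 future periods; y and x are missing there, t is not
tsappend, add(12)

* Trend-only model: t exists in the appended rows, so predict fills them
regress y t
predict yhat_trend

* Model with x: x is missing in the appended rows, so predictions are missing
regress y t x
predict yhat_x

count if missing(yhat_trend)
count if missing(yhat_x)
```

If the second pattern matches the problem, the future values of the regressors need to be supplied (or forecast themselves) before predict can extend past the sample.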

technique to test validity of IV (instrumental variables) under ivoprobit model

Hello,
I am using ivoprobit regression for my empirical analysis. Now I want to test the validity of the instrumental variables. Could you suggest any tool or technique with which I can check the validity of the instruments?

Thanks in advance.

Help on correlated random effects panel

I have a panel of data from 1990 to 2013, with a combination of time-invariant variables and year dummies; consequently, I cannot use fixed-effects models. In my re models, I notice that Stata omits most of my i.year dummies, while a couple of years are excluded from the results because of multicollinearity. I have included an x-bar variable, which is also omitted due to multicollinearity. Can anyone suggest an explanation for Stata's omission of most of my year fixed effects? Also, should I be bothered about the exclusion of a couple of years, or should I be happy that Stata worked out the model including five year effects?

missing Prob>F in a Clustered Standard Errors Two dimensions

Hello everyone,

I am trying to run a regression with standard errors clustered in two dimensions.
So I use the following routine:
xi: cluster2 Y X1 C_X2 C_X3 X4 X5 C_dummy1 interactionX2_dummy1 C_dummy2 interactionX3_dummy2 lnTA if year>2011, fcluster(municipal _id) tcluster(year)

I am using an unbalanced panel data (325 firms for 4 years). So the clusters are unbalanced too.

***************************************************************************************
Linear regression with 2D clustered SEs           Number of obs =    502
                                                  F( 10,   486) =      .
                                                  Prob > F      =      .
Number of clusters (firms) =    227               R-squared     = 0.4533
Number of clusters (year)  =      4               Root MSE      = 0.0000
---------------------------------------------------------------------------------------
                     Y |      Coef.  Std. Err.       t   P>|t|     [95% Conf. Interval]
-----------------------+---------------------------------------------------------------
                    X1 |  -1.82e-08   4.78e-09   -3.81   0.000    -2.76e-08   -8.85e-09
                  C_X2 |   1.41e-08   9.38e-09    1.50   0.133    -4.32e-09    3.25e-08
                  C_X3 |  -5.84e-10   2.78e-10   -2.10   0.036    -1.13e-09   -3.80e-11
                    X4 |  -.1133196   .1284042   -0.88   0.378    -.3656155    .1389762
                    X5 |  -1.03e-09   2.00e-09   -0.51   0.607    -4.97e-09    2.91e-09
              C_dummy1 |   3.66e-10   3.58e-10    1.02   0.308    -3.38e-10    1.07e-09
              C_dummy2 |  -2.40e-09   2.88e-09   -0.83   0.405    -8.07e-09    3.26e-09
  interactionX2_dummy1 |   1.17e-09   6.05e-09    0.19   0.047    -1.07e-08    1.31e-08
  interactionX3_dummy2 |  -1.46e-09   2.74e-10   -5.35   0.000    -2.00e-09   -9.26e-10
                  lnTA |  -9.67e-10   2.75e-10   -3.52   0.000    -1.51e-09   -4.28e-10
                 _cons |   1.62e-08   4.66e-09    3.47   0.001     7.02e-09    2.53e-08
---------------------------------------------------------------------------------------

SE clustered by firms and year
***************************************************************************************

I've noticed that some users have the same problem as me (http://www.statalist.org/forums/foru...terpret-prob-f).
So I've tried excluding the unbalanced clusters from my regression, but the problem remains.

Can you suggest how to deal with this problem?

Thank you in advance
Ioanna

Fama-MacBeth regression without constant

I changed xtfmb.ado file to allow the option 'noconstant'. Anyone interested can verify the validity.

two-way cluster without constant

I changed the command cluster2 to allow the option 'noconstant'. Anyone interested may verify.

How to count distinct strings for disaggregated panel data

Hi,

I couldn’t find any previous threads similar to this.

I have disaggregated panel data that look like this:

importer HScode year value
MYS 1245 2001 54678
MYS 1245 2002 67657
MYS 2460 2001 443
MYS 2460 2002 0
BRN 0410 2001 1455
BRN 0410 2002 2560
BRN 3919 2001 0
BRN 3919 2002 1005

The values of import are disaggregated according to HS 4-digit. The variables are importer (importer), the product categories based on HS code at 4-digit level (HScode), year (year), and the values of import (value).

The question is:

How can I count the number of product categories based on HScode for each importer with trade only (excluding HScode with value of zero), so that later on I can prepare a summary table something like this:

importer year no_categories
MYS 2001 2
MYS 2002 1
BRN 2001 1
BRN 2002 2

no_categories is the count of categories (distinct strings) based on HS codes with trade (not zero) for each importer and for each year. Thank you for your help.
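
A possible sketch of the count, using the example rows above: egen's tag() function marks one observation per importer-HScode-year group among rows with positive trade, and collapse then sums those marks within importer and year. It assumes HScode is stored as a string (str4) and that zero is the only value meaning "no trade".

```stata
clear
input str3 importer str4 HScode int year long value
"MYS" "1245" 2001 54678
"MYS" "1245" 2002 67657
"MYS" "2460" 2001 443
"MYS" "2460" 2002 0
"BRN" "0410" 2001 1455
"BRN" "0410" 2002 2560
"BRN" "3919" 2001 0
"BRN" "3919" 2002 1005
end

* Mark one row per distinct importer-HScode-year with positive trade;
* rows excluded by the if condition get 0, not missing
egen byte tag = tag(importer HScode year) if value > 0

* Number of distinct traded categories per importer and year
collapse (sum) no_categories = tag, by(importer year)
list, sepby(importer)
```

On these eight rows the result reproduces the summary table in the post. To keep the original rows instead of collapsing, egen no_categories = total(tag), by(importer year) would attach the same count to every observation.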

Strategies for difficult imputation

I'm having trouble getting my imputation to work. My original data are panel data with up to six observations per unit. I have reshaped wide for the purposes of imputing. I don't know if this is part of what's causing my difficulty, but many units joined the panel late, meaning they are completely empty in early waves. I have filled in some of these variables like age and year so that I can put them on the right hand side of my imputation model.

The following imputation model runs:
Code:
mi impute chained (pmm, knn(5)) a1-a6 b1-b6 c1-c6 (ologit) d (logit) e6 =  g h i j age1-age6 year1-year6, add(1) ///
           rseed(12345) by(race) augment
Eventually I want to get the logit part to include e1-e6 and f1-f6, but adding any in addition to e6 results in a failure to converge.

So far I have tried using hard missings for e and f in years a unit was unobserved, which didn't help.

Is there anything else I can try?

dropping firms with few obs


I'm using Stata 13 on Windows 10. I have a data set with firms' quarterly earnings announcements.
What I need is to guarantee that I have four fiscal quarters for each firm in each fiscal year. I have an idea of how to solve this, but I don't know how to code it. Let me give you an example with my data.

permno: id
anndats: earnings announcement day - Annual
anndatsq: earnings announcement day - quarter

I can identify the end of a fiscal year as the observation where anndats = anndatsq. Thus, what I really need is exactly three observations (three anndatsq) between consecutive observations where anndats equals anndatsq.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long permno float(anndats anndatsq)
10001 17408 17210
10001 17408 17288
10001 17408 17408
10001 17805 17484
10001 17805 17576
10001 17805 17667
10001 17805 17805
10001 18352 17853
10001 18352 17987
10001 18352 18032
10001 18352 18123
10001 18352 18213
10001 18352 18352
10001 18722 18399
10001 18722 18490
10001 18722 18581
end
format %td anndats
format %td anndatsq
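
One way this could be coded, as a sketch under an assumption taken from the example data: anndats is constant within a firm's fiscal year, so permno and anndats together identify a firm-year. A firm-year is kept only if it has exactly four quarterly announcements and one of them falls on the annual announcement date. The helper names nq and has_yearend are hypothetical.

```stata
clear
input long permno float(anndats anndatsq)
10001 17408 17210
10001 17408 17288
10001 17408 17408
10001 17805 17484
10001 17805 17576
10001 17805 17667
10001 17805 17805
10001 18352 17853
10001 18352 17987
10001 18352 18032
10001 18352 18123
10001 18352 18213
10001 18352 18352
10001 18722 18399
10001 18722 18490
10001 18722 18581
end
format %td anndats anndatsq

* Number of quarterly announcements in each firm-year
bysort permno anndats: gen nq = _N

* Does the firm-year contain the fiscal-year-end announcement?
by permno anndats: egen byte has_yearend = max(anndatsq == anndats)

* Keep complete firm-years only
keep if nq == 4 & has_yearend
list, sepby(permno anndats)
```

In the example this keeps only the four rows with anndats equal to 17805; the firm-years with three or six quarterly dates are dropped.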

identifying latest obs

I'm using Stata 13 on Windows 10.

I have a data set with analysts' consensus dates, earnings announcements, and fiscal year ends. I'd like to identify the last consensus before a given fiscal year ends. This is an example of my data.

ibtic is the firm identifier
pends is the fiscal year end date
anndats_an is the analysts' consensus date

What I want is the last anndats_an before pends.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str6 ibtic float(pends anndats_an)
"0004" 20453 19949
"0004" 20453 19984
"0004" 20453 20012
"0004" 20453 20047
"0004" 20453 20075
"0004" 20453 20103
end
format %td pends
format %td anndats_an
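
A possible sketch using the example above: within each firm and fiscal year end, take the maximum of anndats_an over the rows where it is strictly earlier than pends. The names last_an and is_last are hypothetical, and "before" is read here as strictly before the fiscal year end.

```stata
clear
input str6 ibtic float(pends anndats_an)
"0004" 20453 19949
"0004" 20453 19984
"0004" 20453 20012
"0004" 20453 20047
"0004" 20453 20075
"0004" 20453 20103
end
format %td pends anndats_an

* Latest consensus date strictly before the fiscal year end;
* dates on or after pends are turned into missing and ignored by max()
egen last_an = max(cond(anndats_an < pends, anndats_an, .)), by(ibtic pends)
format %td last_an

* Flag the observation that carries that date
gen byte is_last = anndats_an == last_an
list, sepby(ibtic pends)
```

Adding keep if is_last afterwards would leave one row per firm and fiscal year, provided consensus dates are not duplicated within a group.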