Channel: Statalist

Exporting the results of a sorted sum command to a table

Hi,
I am a new Stata user, working with panel data. I want to report summary statistics for a variable called odpadj for each year. To do this, I used the command: by year, sort: sum odpadj. I now want to report these results in a table that I can export to Word. I have looked at both the esttab and outreg2 commands, which I understand how to use to create tables for regression results and basic summary statistics, but I cannot find any information on how to apply them to my particular situation.
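
A possible starting point (a sketch, assuming the estout package is installed from SSC; the output file name is hypothetical): estpost can store tabstat results by year so that esttab can export them to a Word-readable RTF file.

Code:
ssc install estout
estpost tabstat odpadj, by(year) statistics(mean sd min max count)
esttab using summary.rtf, cells("mean sd min max count") noobs nonumber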

I appreciate any help/feedback.

Thank you,
Anthony M.

We're hiring for Stata gurus - New York

Hi all,

Apologies in advance if this post is not appropriate for the forum; I read through the FAQs and could not find specific guidance.

The Behavioral Insights Team in New York is hiring for a Research Associate Advisor / Advisor (exact position depends on experience). If you love data, have an impressive command of econometrics, statistical modelling, experience running randomized controlled trials, and an in-depth knowledge of Stata, we'd love to hear from you: https://lnkd.in/dfmvR2f. This is a great opportunity to join a growing team and apply your skills to solving issues of public import in the US and beyond.

Feel free to send me a private message if you have any questions.

Reshape table (pivot table)

Dear stata-users,

I am writing because I hope someone here can help me out with their knowledge. So far I have not been able to solve this on my own by consulting the Stata manual and forums. It would be wonderful if anyone has good advice!

I want to transpose my panel data. To make my situation clearer, I will illustrate the problem with an example.
My own data consist of more than 600 banks over almost 10 years, but to keep things simple I will illustrate with 3 banks over three days. This is an example of the current dataset:

permno date ticker Company name Cusip Ret
1. 10002 02/07/2007 BTFG Banctrust 05978R19 -2.892
2. 10002 03/07/2007 BTFG Banctrust 05978R19 1.434
3. 10002 04/07/2007 BTFG Banctrust 05978R19 2.324
4. 20395 02/07/2007 WJMK Warrior 61778Q19 1.343
5. 20395 03/07/2007 WJMK Warrior 61778Q19 3.233
6. 20395 04/07/2007 WJMK Warrior 61778Q19 2.233
7. 40234 02/07/2007 BJBP Banc Jones 20124B41 3.233
8. 40234 03/07/2007 BJBP Banc Jones 20124B41 -1.343
9. 40234 04/07/2007 BJBP Banc Jones 20124B41 1.234

- Permno is an identifier of the company
- Date = Date
- Ret = Return of the shareprice

Now I want to transpose (like a pivot table in excel) the data to the following new data set:
date permno permno permno
10002 20395 40234
1. 02/07/2007 -2.892 1.343 3.233
2. 03/07/2007 1.434 3.233 -1.343
3. 04/07/2007 2.324 2.233 1.234

So in the new layout I want the permno (id) in the columns and the date in the rows, with ret (the share-price returns) as values.

The problem is that when I use the command reshape wide ret, i(permno) j(date), the id (permno) stays in the rows, and you cannot choose what goes in the columns, the rows, or the values. I have dropped the other variables that I don't need; otherwise the command above won't work.
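
For what it's worth, a sketch of one way to get dates in rows and permno in columns: keep only the three needed variables, then make date the i() variable and permno the j() variable.

Code:
keep permno date ret
reshape wide ret, i(date) j(permno)

This would produce one row per date, with variables ret10002, ret20395, and ret40234 holding the returns.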

I hope someone can help me use this command to reach my goal. I look forward to hearing from you; that would be awesome.


constraint issues estimating absolute price Rotterdam model

Hello,

I am trying to estimate the absolute price version of the Rotterdam model for major food groups in Canada using annual aggregate price and quantity data. My data ranges from 1995 to 2015, and I have included 7 aggregated food groups (poultry, red meat, fish, fruit and vegetables, dairy, grains, and oils) in my demand system.

My issue is that the homogeneity and adding-up constraints do not appear to be holding when using the sureg command.

To estimate the model, I ran the sureg command and dropped one share equation (for example, oils) to avoid the simultaneity issue thus estimating n-1 share equations. However, the estimates show that the homogeneity and adding-up restrictions on the estimated coefficients don’t hold. This is confirmed when I try to recover the parameters of the omitted equation (oils) by re-estimating the model and dropping a different share equation (for example, grains). In this case, the coefficient estimates for the other share equations (poultry, red meat, fish, fruit and vegetables, dairy) are completely different between the two regressions. In addition, when I try to incorporate a dummy variable to test for structural change after year 2000, the homogeneity and adding-up constraints won’t hold.

The constraints appear to hold if I use the isure command, but only for the original model without the dummy variable. Once the dummy variable is included, the model will not converge under isure (the iteration count goes extremely high, i.e. 350+, so I eventually stop the estimation). I'm not sure if it's due to the small sample size (t=20) or whether I am missing something when I enter my constraints.

Any help or advice would be greatly appreciated!

Following previous threads, I entered the constraints in the following way (where Dp represents the log change of prices).

/* homogeneity restrictions */

constraint define 1 [Poultry]Dp1 + [Poultry]Dp2 + [Poultry]Dp3 + [Poultry]Dp4 + [Poultry]Dp5 + [Poultry]Dp6 + [Poultry]Dp7 = 0
constraint define 2 [Redmeat]Dp1 + [Redmeat]Dp2 + [Redmeat]Dp3 + [Redmeat]Dp4 + [Redmeat]Dp5 + [Redmeat]Dp6 + [Redmeat]Dp7 = 0
constraint define 3 [Fish]Dp1 + [Fish]Dp2 + [Fish]Dp3 + [Fish]Dp4 + [Fish]Dp5 + [Fish]Dp6 + [Fish]Dp7 = 0
constraint define 4 [Fruit]Dp1 + [Fruit]Dp2 + [Fruit]Dp3 + [Fruit]Dp4 + [Fruit]Dp5 + [Fruit]Dp6 + [Fruit]Dp7 = 0
constraint define 5 [Dairy]Dp1 + [Dairy]Dp2 + [Dairy]Dp3 + [Dairy]Dp4 + [Dairy]Dp5 + [Dairy]Dp6 + [Dairy]Dp7 = 0
constraint define 6 [Grains]Dp1 + [Grains]Dp2 + [Grains]Dp3 + [Grains]Dp4 + [Grains]Dp5 + [Grains]Dp6 + [Grains]Dp7 = 0

/* symmetry restrictions */

constraint define 7 [Poultry]Dp2=[Redmeat]Dp1
constraint define 8 [Poultry]Dp3=[Fish]Dp1
constraint define 9 [Poultry]Dp4=[Fruit]Dp1
constraint define 10 [Poultry]Dp5=[Dairy]Dp1
constraint define 11 [Poultry]Dp6=[Grains]Dp1
constraint define 12 [Redmeat]Dp3=[Fish]Dp2
constraint define 13 [Redmeat]Dp4=[Fruit]Dp2
constraint define 14 [Redmeat]Dp5=[Dairy]Dp2
constraint define 15 [Redmeat]Dp6=[Grains]Dp2
constraint define 16 [Fish]Dp4=[Fruit]Dp3
constraint define 17 [Fish]Dp5=[Dairy]Dp3
constraint define 18 [Fish]Dp6=[Grains]Dp3
constraint define 19 [Fruit]Dp5=[Dairy]Dp4
constraint define 20 [Fruit]Dp6=[Grains]Dp4
constraint define 21 [Dairy]Dp6=[Grains]Dp5

global demand1 "(Poultry: s1 Dp1 Dp2 Dp3 Dp4 Dp5 Dp6 Dp7 DQ, robust nocon)"
global demand2 "(Redmeat: s2 Dp1 Dp2 Dp3 Dp4 Dp5 Dp6 Dp7 DQ, robust nocon)"
global demand3 "(Fish: s3 Dp1 Dp2 Dp3 Dp4 Dp5 Dp6 Dp7 DQ, robust nocon)"
global demand4 "(Fruit: s4 Dp1 Dp2 Dp3 Dp4 Dp5 Dp6 Dp7 DQ, robust nocon)"
global demand5 "(Dairy: s5 Dp1 Dp2 Dp3 Dp4 Dp5 Dp6 Dp7 DQ, robust nocon)"
global demand6 "(Grains: s6 Dp1 Dp2 Dp3 Dp4 Dp5 Dp6 Dp7 DQ, robust nocon)"

sureg $demand1 $demand2 $demand3 $demand4 $demand5 $demand6, constraints(1/21)


ml_mediation is unrecognized

Dear all,

I am quite new to Stata. I am trying to run a multilevel mediation model using ml_mediation.

My code is:
ml_mediation, dv(happiness) iv(gini) mv(fairness) l2id(V2)

Somehow it didn't work and I got the message "command ml_mediation is unrecognized." Does anyone have an idea of what's going on? I have googled for a while but couldn't figure it out.

My Stata version is 14.2, fully updated.
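
For reference, ml_mediation is a user-written command, so it must be installed before it can be run. One way to locate it (a sketch, assuming net access):

Code:
findit ml_mediation
* then follow the installation link in the viewer window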

Thanks!

Can the results of Truncated regression be used in Mediation analysis

Dear statistics specialists,

In mediation analysis, the paths can be estimated by OLS. In some cases, other methods of estimation (e.g., logistic regression, multilevel modeling, and structural equation modeling) can be used instead of multiple regression (Kenny 2016). However, can the results of truncated regression be used in mediation analysis as well? Assume that X (predictor), M (mediator), and Y (outcome) are all continuous, and the other assumptions are met. Thank you very much.

Reference:

David A. Kenny (2016) http://davidakenny.net/cm/mediate.htm

GMM estimation of personalised equations

Dear all,

I am working with a panel dataset on a GMM estimation of some parameters.
Using
Code:
xtreg
I estimated the coefficients of a regression of the form y = Xb + u, then I extracted residuals using
Code:
predict
and saved them.

Following the paper by Hubbard (1994), the model I am trying to estimate assumes that the residuals r_it are the sum of an AR(1) component u_it and an additional error term v_it.
In formulas: r_it = u_it + v_it = a*u_it-1 + e_t + v_it, where e_t is the error of the AR(1) process and a is the autocorrelation coefficient.

To estimate separately all the pieces, the paper uses a GMM estimation equating theoretical moments with empirically estimated moments.
In detail, let C_k be the k-th autocovariance of delta_r = r_it - r_it-1: C_k = E[ (r_it - r_it-1)*(r_it-k - r_it-k-1) ]. Then after some tedious algebra:

C_0 = 2*{sigma_e}/(1+{a})+2*{sigma_u}
C_1 = {sigma_e}*(1-{a})/(1+{a})-{sigma_u}
C_2 = -{a}^1*{sigma_e}*(1-{a})/(1+{a})
...
C_6 = -{a}^5*{sigma_e}*(1-{a})/(1+{a})

Parameters to be estimated are obviously {sigma_e}, {a} and {sigma_u}.
How can I implement this estimation/minimisation problem using the GMM method?

I tried to calculate empirically the series of the covariances:

Code:
gen delta_r = r - l.r
egen mean_r = mean(delta_r)

gen C_0 = (delta_r - mean_r)^2
gen C_1 = (delta_r - mean_r)*(l.delta_r - mean_r)
...
gen C_6 = (delta_r - mean_r)*(l6.delta_r - mean_r)
Then to implement GMM I tried:

Code:
gmm (C_0 - (2*{sigma_e}/(1+{a}) + 2*{sigma_u})) ///
         (C_1 - ({sigma_e}*(1-{a})/(1+{a}) - {sigma_u})) ///
         (C_2 + {a}*{sigma_e}*(1-{a})/(1+{a})) ///
         ...
         (C_6 + {a}^5*{sigma_e}*(1-{a})/(1+{a})), ///
         instruments(C_0 C_1 C_2 C_3 C_4 C_5 C_6) winitial(identity)
However, the code does not work and gives this error:
Code:
could not calculate numerical derivatives -- flat or discontinuous region encountered
r(430);
I suspect this comes from a misspecified command rather than a genuine numerical problem. How can I fix this?

Thanks for your attention, and apologies for the long question.

Best,
Luca

Plot HR

How can I plot, as a line, the hazard ratio of diabetes derived from:
stcox outcome age gender diabetes proteinuria

My idea is to show the HR of diabetes on the y axis and proteinuria values on the x axis.
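
One possible sketch (not necessarily the intended model): let the diabetes effect vary with proteinuria through an interaction, then plot the diabetes contrast on the linear-prediction (log-hazard) scale across proteinuria values. The value range 0(1)5 is hypothetical, and -outcome- is omitted here.

Code:
stcox age gender i.diabetes##c.proteinuria
margins, dydx(diabetes) at(proteinuria=(0(1)5)) predict(xb)
marginsplot

Exponentiating the plotted contrasts would give hazard ratios.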

DID test on panel data

Dear Statalist,

I am new to the forum and would like some help with my analysis.
I am running a fixed-effects DID in order to evaluate whether Partner (Partner_ID) performance (estimated by Avgdaystofund_Q) changes after a certain shock (Badgecount).
I have 16 time periods (T) in total, and the shock happened at T=9. The periods I'd like to include in the analysis are from T=5 to T=13.

That said, I am running the following:
xtreg Avgdaystofund_Q i.Badgecount i.T if T>=5 & T<=13, fe cl(Partner_ID)

My worry is that when I run the model, the period Stata drops is T=5. I have the feeling that T=9 should be dropped instead, but firstly I'm not sure that I'm right, and secondly, if I am right, I do not know how to tell Stata to drop T=9 instead.
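
A sketch of one way to choose the omitted period using factor-variable base notation (ib9.T makes T=9 the base level):

Code:
xtreg Avgdaystofund_Q i.Badgecount ib9.T if inrange(T, 5, 13), fe cluster(Partner_ID)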

I thank you all in advance for the help!
Bianca

trend analysis

Hello everyone,
I have panel data with three variables (id is the panel ID, year, and v1). I need to regress v1 on time (year) using 3-year windows to create a rolling time-trend analysis, and then create a new variable v2 to save the regression coefficients.

I used this code:
xtset id year
rolling v2=_b[year], window(3) : regress v1 year

The coefficients I got are larger than one and I have no explanation. Is my code correct? If so, how can I interpret these coefficients? Here are my data and my results.

Thank you very much for your help.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double(year v1) float id
1970 1442 1
1971 4700 1
1972 5200 1
1973 6002 1
1974 6400 1
1975 3900 1
1976 3400 1
1977 3146 1
1978 1354 1
1979 7685 1
1980 3897 1
1981 6574 1
1982 7568 1
1983 4657 1
1984 9867 1
1985 6455 1
1986 9823 1
1987 8975 1
1988 5684 1
1970 4546 2
1971 8756 2
1972 4354 2
1973 9864 2
1974 4354 2
1975 7645 2
1976 1654 2
1977 9647 2
end


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float id double(start end) float v2
1 1970 1972    1879
1 1971 1973     651
1 1972 1974     600
1 1973 1975   -1051
1 1974 1976   -1500
1 1975 1977    -377
1 1976 1978   -1023
1 1977 1979  2269.5
1 1978 1980  1271.5
1 1979 1981  -555.5
1 1980 1982  1835.5
1 1981 1983  -958.5
1 1982 1984  1149.5
1 1983 1985     899
1 1984 1986     -22
1 1985 1987    1260
1 1986 1988 -2069.5
2 1970 1972     -96
2 1971 1973     554
2 1972 1974       0
2 1973 1975 -1109.5
2 1974 1976   -1350
2 1975 1977    1001
end

spineplot broken?

spineplot is a package for Stata described in this Stata Journal article. It can be installed with
Code:
ssc install spineplot
The help file includes this example:
Code:
sysuse auto
spineplot foreign rep78, xti(frequency, axis(1)) xla(0(10)60, axis(1)) xmti(1/69, axis(1))
When I run this command on my PC I get this graph:

[attached graph]

Could you please confirm if you get the same graph or if this is supposed to look different? The other examples from the help file produce graphs that look as I would expect. Here is another example:
Code:
spineplot foreign rep78
[attached graph]

I am asking because I noticed that some old code of mine that uses spineplot is now broken and produces graphs that look similar to the first graph above: the rectangle is empty and all labels on the lower axis have been pushed to the left side. Here is a graph that I created in 2016:

[attached graph]

Here is the graph that I get today with the same do-file and same dataset used to create the graph above:

[attached graph]

I tried Stata 13.1 and 14.2 with the same result. I even went back to spineplot 1.0.4, which I had used last year (the current version is 1.1.1), but the graph is still empty. The only reason I can think of is a recent update to Windows 10 version 1703 but it seems odd that this would have an effect on spineplot graphs. I have no problems with other Stata graph commands.

Manipulating Output Tables

Hi Statalist,
I'm outputting results tables for the first time and am new at this, so apologies if this question has an obvious answer...

How do I output a summary statistics table, either just in Stata or into Excel, with the general type of format shown in the attached picture?



Where the #s are means and (standard deviations), respectively.

I know how to output such a table for just "all days". I also found that specifying ", by(weekend)" did split things up by day type, but it (a) inverted the table so that the variables ran along the top as column headers instead of down the side as rows, and (b) as a result I couldn't figure out how to also include the "all days" stats. Finally, I didn't know how to output the standard deviations in parentheses under the means; is this easy, or does it need to be done in some special way? (If it is challenging I can just put SDs in separate columns for now.)

I know matrices are often used for generating tables, but I wasn't sure if there is an easy way to do this, especially since I am just doing a first pass at summary statistics for a rough draft.
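
For what it's worth, a sketch of one common recipe (assuming the estout package from SSC; var1 and var2 are placeholder variable names): estpost tabstat with by() plus esttab's main() and aux() options prints standard deviations in parentheses under the means, with the groups unstacked into columns.

Code:
estpost tabstat var1 var2, by(weekend) statistics(mean sd) columns(statistics)
esttab, main(mean) aux(sd) unstack nostar noobs nonote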

As always, many thanks to Statalist for your invaluable help and immeasurable patience!

Analyzing Pre and Post Data while accounting for time/season

Hello,
I am working on a study in which we want to assess the relationship between maternal vitamin D status during pregnancy (mid gestation) and vitamin D status 6 months postpartum. The data are paired, such that each woman has a measurement at both time points.
Initially, I thought a paired t-test could accomplish this goal. However, I then realized that season plays an important role in vitamin D levels, and vitamin D measured during the prenatal period was likely collected during a different season than vitamin D measured 6 months postpartum.
Is there a statistical model that can account for this?
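
One possible sketch (the variable names here are hypothetical): a linear mixed model with a random intercept per woman handles the pairing, while a season indicator adjusts for when each sample was drawn.

Code:
mixed vitd i.timepoint i.season || womanid: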
Thank you!
Jill

Mixed: 95% confidence intervals

Hi Folks,

I am somewhat new to mixed, and have a question about some fairly large confidence intervals. Is there anything I can do about them? Here is a description of my model and some code. Thanks.

Data: unbalanced panel dataset of depth to water for irrigation wells (from the years 1910-2013), which are situated in counties.
Explanatory variables: At the county level: the population density of the county (which changes by year); At the well level: the altitude of the well, the distance of the well from the river (both of which don't change); and then I have dummy variables for the drought or wet conditions of that particular year (drought1, drought2, wet1, wet2, with wet3 excluded).


Code:
mixed lnper_cdtw100 lnpopden_county lnalt_avg lnriv_km2 drought1 drought2 wet1 wet2 ||countyid: lnpopden_county ||id2: lnalt_avg lnriv_km2 , robust

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0: log pseudolikelihood = 21291.913
Iteration 1: log pseudolikelihood = 21350.855
Iteration 2: log pseudolikelihood = 21356.944
Iteration 3: log pseudolikelihood = 21356.946

Computing standard errors:

Mixed-effects regression Number of obs = 32,169


No. of Observations per Group
Group Variable Groups Minimum Average Maximum

countyid 31 1 1,037.7 5,400
id2 2,300 1 14.0 70


Wald chi2(7) = 47.18
Log pseudolikelihood = 21356.946 Prob > chi2 = 0.0000

(Std. Err. adjusted for 31 clusters in countyid)

Robust
lnper_cdtw100 Coef. Std. Err. z P>z [95% Conf. Interval]

lnpopden_county .0014967 .0008484 1.76 0.078 -.0001661 .0031595
lnalt_avg -.0164247 .0072642 -2.26 0.024 -.0306622 -.0021872
lnriv_km2 .0014593 .0008141 1.79 0.073 -.0001362 .0030549
drought1 .0074959 .0020855 3.59 0.000 .0034083 .0115834
drought2 .0044572 .0070613 0.63 0.528 -.0093827 .0182971
wet0 -.2147777 .0598072 -3.59 0.000 -.3319976 -.0975578
wet1 .018222 .0059625 3.06 0.002 .0065358 .0299082
_cons 4.757492 .0614828 77.38 0.000 4.636988 4.877996

Robust
Random-effects Parameters Estimate Std. Err. [95% Conf. Interval]

countyid: Independent
var(lnpopd~y) 1.54e-06 .0000194 3.16e-17 75331.21
var(_cons) .0000168 .0000483 6.06e-08 .0046738

id2: Identity
var(_cons) 7.06e-16 9.36e-14 1.2e-128 4.21e+97

var(Residual) .0155087 .0027515 .0109536 .0219581

Creating an Index

Hello, I have four variables which supposedly measure 4 dimensions of the same concept. I want to create an index in Stata, but I have no idea of the best way to do it. The response scale is the same for all four, and because these are experimental data, only people in the third experimental group answered these questions. Thank you.
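
A sketch of two common starting points (assuming a simple additive index is acceptable; this is not the only defensible choice):

Code:
* check internal consistency of the four items
alpha ener_sec1 ener_sec2 ener_sec3 ener_sec4, item
* simple additive index: the row mean of the items
egen ener_index = rowmean(ener_sec1 ener_sec2 ener_sec3 ener_sec4)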



----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double(ener_sec4 ener_sec3 ener_sec2 ener_sec1)
. . . .
. . . .
. . . .
. . . .
. . . .
5 4 2 2
5 4 3 3
5 5 4 4
. . . .
. . . .
5 5 5 5
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
5 5 4 4
5 5 5 5
. . . .
. . . .
. . . .
. . . .
. . . .
2 5 2 1
5 5 4 4


end
label values ener_sec4 P46_3_4
label def P46_3_4 1 "Sin importancia en absoluto", modify
label def P46_3_4 2 "Algo sin importancia", modify
label def P46_3_4 3 "Ni importante ni sin importancia", modify
label def P46_3_4 4 "Algo importante", modify
label def P46_3_4 5 "Extremadamente importante", modify
label values ener_sec3 P46_3_3
label def P46_3_3 1 "Sin importancia en absoluto", modify
label def P46_3_3 2 "Algo sin importancia", modify
label def P46_3_3 3 "Ni importante ni sin importancia", modify
label def P46_3_3 4 "Algo importante", modify
label def P46_3_3 5 "Extremadamente importante", modify
label values ener_sec2 P46_3_2
label def P46_3_2 1 "Sin importancia en absoluto", modify
label def P46_3_2 2 "Algo sin importancia", modify
label def P46_3_2 3 "Ni importante ni sin importancia", modify
label def P46_3_2 4 "Algo importante", modify
label def P46_3_2 5 "Extremadamente importante", modify
label values ener_sec1 P46_3_1
label def P46_3_1 1 "Sin importancia en absoluto", modify
label def P46_3_1 2 "Algo sin importancia", modify
label def P46_3_1 3 "Ni importante ni sin importancia", modify
label def P46_3_1 4 "Algo importante", modify
label def P46_3_1 5 "Extremadamente importante", modify
------------------ copy up to and including the previous line ------------------

GMM Problems with xtabond2

I am using panel data for 27 countries over the period 1995-2015, with the command xtabond2. Please find the results I get below. Can somebody please clarify what is wrong? Why are the standard errors missing?

Code:
xtabond2 PATR3 l.PATR3 preelection right left preelectionright preelectionleft pit corporate inflation gdp unemployment govtexp cashtransfers wage waged urbanisation popbelow14, gmm(l.PATR3 gdp preelection, lag(1 3)) iv(popbelow14 urbanisation) robust twostep
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
Warning: Number of instruments may be large relative to number of observations.
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
  Difference-in-Sargan/Hansen statistics may be negative.

Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: country                         Number of obs      =       433
Time variable : year                            Number of groups   =        27
Number of instruments = 218                     Obs per group: min =         7
Wald chi2(17) =      5.00                                      avg =     16.04
Prob > chi2   =     0.998                                      max =        19
------------------------------------------------------------------------------
             |              Corrected
       PATR3 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       PATR3 |
         L1. |   .6297856          .        .       .            .           .
             |
 preelection |   .0213907          .        .       .            .           .
       right |  -.0069543   .0240444    -0.29   0.772    -.0540803    .0401718
        left |  -.0249246          .        .       .            .           .
preelecti~ht |  -.0189756          .        .       .            .           .
preelecti~ft |  -.0084816          .        .       .            .           .
         pit |   .6294965          .        .       .            .           .
   corporate |   .2687776   .1471041     1.83   0.068    -.0195412    .5570963
   inflation |  -.0404798          .        .       .            .           .
         gdp |   .0007505   .0003671     2.04   0.041      .000031    .0014701
unemployment |   .2105017          .        .       .            .           .
     govtexp |   -.010638          .        .       .            .           .
cashtransf~s |  -.1837166   1.716918    -0.11   0.915    -3.548814    3.181381
        wage |   .0005086          .        .       .            .           .
       waged |  -.1515122          .        .       .            .           .
urbanisation |  -1.185528          .        .       .            .           .
  popbelow14 |  -.0049856   .0105233    -0.47   0.636    -.0256109    .0156396
       _cons |   .9562768          .        .       .            .           .
------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(popbelow14 urbanisation)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/3).(L.PATR3 gdp preelection)
Instruments for levels equation
  Standard
    popbelow14 urbanisation
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.(L.PATR3 gdp preelection)
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -1.36  Pr > z =  0.173
Arellano-Bond test for AR(2) in first differences: z =   0.69  Pr > z =  0.493
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(200)  = 234.50  Prob > chi2 =  0.048
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(200)  =   9.20  Prob > chi2 =  1.000
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  GMM instruments for levels
    Hansen test excluding group:     chi2(144)  =   6.46  Prob > chi2 =  1.000
    Difference (null H = exogenous): chi2(56)   =   2.74  Prob > chi2 =  1.000
  iv(popbelow14 urbanisation)
    Hansen test excluding group:     chi2(198)  =   8.40  Prob > chi2 =  1.000
    Difference (null H = exogenous): chi2(2)    =   0.79  Prob > chi2 =  0.672
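
A sketch of one commonly suggested adjustment (not a guaranteed fix): the collapse suboption reduces the instrument count, which the warning about many instruments flags as a possible cause of the singular weighting matrix and the missing standard errors.

Code:
xtabond2 PATR3 l.PATR3 preelection right left preelectionright preelectionleft ///
    pit corporate inflation gdp unemployment govtexp cashtransfers wage waged ///
    urbanisation popbelow14, gmm(l.PATR3 gdp preelection, lag(1 3) collapse) ///
    iv(popbelow14 urbanisation) robust twostep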

Confa for confirmatory factor analysis

I have tried using confa (as advised in previous posts) for a confirmatory factor analysis, so I did something like this:

confa (energy: ener_sec1 ener_sec2 ener_sec3 ener_sec4 )

and it has given me this message. Why? What does it mean?

initial: log likelihood = -36015.811
rescale: log likelihood = -36015.811
rescale eq: log likelihood = -1516.252
could not calculate numerical derivatives
flat or discontinuous region encountered
convergence not achieved
r(430);


I need to conduct a simple confirmatory factor analysis: one latent variable and 4 observed variables. Thank you.
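
For what it's worth, a sketch of an alternative: in Stata 12 or later, the built-in sem command also fits a one-factor CFA.

Code:
sem (Energy -> ener_sec1 ener_sec2 ener_sec3 ener_sec4)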

Removing Single Quotes from Variable Label

I am trying to clean up variable labels - removing spaces and punctuation - in anticipation of turning them into variable names.

When I try to remove single quotes using the following code, I get errors of "too few quotes" and "invalid syntax". Any advice/assistance would be very helpful.

Code:
foreach var of varlist * {
 local lab `: var label `var''
 
 if length("`lab'") > 80 {
  local lab `: di substr("`lab'", 1, 79)'
 }
 
 local lab `: di subinstr("`lab'", "%", "Pct",.)'
 local lab `: di subinstr("`lab'", "'", "",.)'
 local lab `: di subinstr("`lab'", ":", "",.)'
 
 label var `var' "`lab'"
}
Best,
Erika
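
A sketch of one possible rework, under the assumption that the errors come from macro quoting: compound double quotes (`"..."') let labels containing quotes expand safely, and char(39) stands in for the apostrophe so it never appears literally in the macro expression.

Code:
foreach var of varlist * {
    local lab : var label `var'

    if length(`"`lab'"') > 80 {
        local lab = substr(`"`lab'"', 1, 79)
    }

    local lab = subinstr(`"`lab'"', "%", "Pct", .)
    * char(39) is the apostrophe character
    local lab = subinstr(`"`lab'"', char(39), "", .)
    local lab = subinstr(`"`lab'"', ":", "", .)

    label var `var' `"`lab'"'
}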

Formatting Commas

What is the command to format a first and last name with a comma? For example, I would like "JohnsonLarry" to become "Johnson,Larry".
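
A sketch of one approach (assuming Stata 14 or later for Unicode regular expressions, and a variable named name): insert a comma wherever a lowercase letter is immediately followed by an uppercase one.

Code:
* "JohnsonLarry" becomes "Johnson,Larry"
gen name2 = ustrregexra(name, "([a-z])([A-Z])", "$1,$2")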

Using a loop to create a new variable containing weighted group summary statistics?

Hi, Forum!

I want to calculate a variable containing weighted group summary statistics. I do not want to collapse the data and egen does not support weights.

Here is the loop I have so far:

gen weightedmarried=.
quietly forvalues i = 15/20 {
summarize married [aw=weight] if age == `i' & female==1 & country==2, detail
replace weightedmarried = r(p50) if age == `i'
}

Some more details about the variable I want to create (marriedegyptfemales): I want this variable to represent the proportion of females (female==1) in Egypt (country==2) who are married (married==1), at each age (age).

I found the following post, which seems meant to solve this issue, but I still can't get the loop to work. Where did my loop go wrong?
http://www.stata.com/support/faqs/da...ry-statistics/
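
For what it's worth, a sketch of one adjustment: the proportion of a 0/1 indicator is its mean, so r(mean) rather than r(p50) would be the statistic to store, and the replace should use the same if conditions as the summarize.

Code:
gen marriedegyptfemales = .
quietly forvalues i = 15/20 {
    summarize married [aw=weight] if age == `i' & female == 1 & country == 2
    replace marriedegyptfemales = r(mean) if age == `i' & female == 1 & country == 2
}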

-Nikola