Channel: Statalist

Industry-based distribution of dependent and independent variable

Hi,

I am using Stata 16, and I am trying to summarize data in a way similar to what I saw in a paper, a screenshot of which I have shared here. It shows the industry-based distribution of the dependent variable (PSD Index) and the independent variables (Lobbying, Political Connection, Investor Activism). I tried to use the tabulate command in Stata, but it gave me the following error:
Code:
too many variables specified
I searched the forum for posts that might point me to an easier way to do it. Is there a simple command in Stata that will help me? If there is not, I assume I will have to use the bysort egen method. If you have any suggestions, I would be very grateful.
[Screenshot not preserved]
Many thanks,

Shiwani
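
One possible route, offered as a sketch rather than a definitive answer: tabstat accepts several variables together with a by() grouping, which tabulate does not. The example below uses the shipped auto dataset as a stand-in, so foreign plays the role of the industry variable and price, mpg and weight play the roles of the PSD index and the independent variables. Those mappings are assumptions for illustration, not names from the paper.

```stata
* Stand-in data (assumption): -foreign- acts as the industry variable,
* -price-, -mpg-, -weight- act as the dependent and independent variables
sysuse auto, clear

* One row per group, one column per variable, showing group means
tabstat price mpg weight, by(foreign) statistics(mean) format(%9.2f)

* The same table as a small dataset, if it needs to be exported
preserve
collapse (mean) price mpg weight, by(foreign)
list, noobs
restore
```

With real data, the three stand-in variables and foreign would be replaced by the actual variable names.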

Logistic odds ratio in loops

Dear Statalist,

I am running a logistic regression in a loop and later use the results in coefplot.
First I used margins with the commands below, which produced the first figure:

webuse lbw
global outcomes_binary "low smoke"
global outcomes_continuous "bwt"
global outcomes "$outcomes_binary $outcomes_continuous"
global covariates "age lwt ptl ht ui"

foreach y of global outcomes_binary {
    logit `y' $covariates
    margins, dydx(*) post
    mat b_`y' = r(table)
    mat b_`y' = b_`y'[1..6,1...]'
    mat b_`y' = b_`y'[1...,1], b_`y'[1...,5], b_`y'[1...,6], b_`y'[1...,4]
}

foreach y of global outcomes_continuous {
    reg `y' $covariates
    margins, dydx(*) post
    mat b_`y' = r(table)
    mat b_`y' = b_`y'[1..6,1...]'
    mat b_`y' = b_`y'[1...,1], b_`y'[1...,5], b_`y'[1...,6], b_`y'[1...,4]
}

local i = 1
foreach var of global covariates {
    mat `var' = J(1,4,0)
    foreach y of global outcomes {
        mat `var' = `var' \ b_`y'[`i',1...]
    }
    mat `var' = `var'[2...,1...]
    mat rownames `var' = $outcomes_binary $outcomes_continuous $outcomes_ordinal0 $outcomes_ordinal5 $outcomes_ordinal4
    mat colnames `var' = `var' ci_l ci_h
    local i = `i' + 1
}

coefplot matrix(age[,1]), ci((age[,2] age[,3])) ///
|| matrix(lwt[,1]), ci((lwt[,2] lwt[,3])) ///
|| matrix(ptl[,1]), ci((ptl[,2] ptl[,3])) ///
||, scheme(s1color) xline(0) byopts(row(1)) xlab(-200(200)200 ,format(%9.0f)) mlabel mlabposition(2) format(%9.3f)
[Figure not preserved]

I would like to change the marginal effects into odds ratios, but when I try this command:

foreach y of global outcomes_binary {
    logistic `y' $covariates, or
    mat b_`y' = r(table)
    mat b_`y' = b_`y'[1..6,1...]'
    mat b_`y' = b_`y'[1...,1], b_`y'[1...,5], b_`y'[1...,6], b_`y'[1...,4]
}

Something goes wrong and the figure is incomplete (smoke & bwt are missing).

[Figure not preserved]

Can anyone help me to fix the command for storing odds ratios in loops?
Thanks in advance!

Regards, Anouk

stepwise backward logistic regression using imputed data

Dear all,
I have used ICE to impute 11 of my variables. I have 6000 observations. I would like to run a stepwise backward logistic regression, but I get an error: invalid pr
My code is:
Code:
mi estimate, stepwise, pr(0.05): logistic Y x1 x2 x3
Does stepwise work with mi estimate? If not, is there another way to run a stepwise backward logistic regression with imputed data?

Thanks in advance.

Stata IC 15.1 on Mac

Multinomial logistic regression

Two questions regarding mlogit:

Can the RRR be equated to an odds ratio?
How exactly do you interpret goodness of fit using fitstat? Is it the R2? or the P-value of LR test?

Using seemingly unrelated estimation with Driscoll-Kraay standard errors


I am trying to run suest for two regressions estimated with Driscoll-Kraay standard errors using the command xtscc, as follows:
xtscc y x
est store REG1
xtscc z u
est store REG2
suest REG1 REG2
Stata returns error r(322): REG1 was estimated with a nonstandard vce (Drisc/Kraay). Is there a way to allow Stata to run suest with Driscoll-Kraay errors?

plotting confidence intervals of a regression line in panel data

Dear all,
I am a beginner in Stata and I am trying to find out how to plot confidence intervals of a regression line in panel data.
Can someone please guide me?
Your help is much appreciated,
Best wishes,

Elena.

Creating a new variable

Hello Statalist,

I think I am overthinking a rather simple problem with creating a new variable, so I thought I would post on the forum for some help.

I want to recode some variables for a regression analysis. I have 3 variables for education level: edu_hs, edu_hsplus, and edu_college. Each variable gives the number of people in a county with that level of education, across multiple years. I want to recode them into edu_all, where 1 = edu_hs, 2 = edu_hsplus, and 3 = edu_college. I am not sure if I am conceptualizing it correctly. I also want to apply this logic to the following variables: gender (male and female are separate variables), race (each race category is a separate variable), and age (age groups are separate variables).

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int year float(n_county edu_college edu_hs edu_hsplus race_aian race_asian race_black race_latino) long(popfemale popmale)
12 1  8721  4528  31629  337  566 10116 1274  28461  27053
13 1  8721  4528  31629  329  628 10095 1330  28450  26796
14 1  8721  4528  31629  331  603 10245 1364  28452  26943
15 1  8721  4528  31629  322  664 10493 1417  28490  26857
16 1  8721  4528  31629  325  632 10556 1336  28422  26994
12 2 41289 13956 125758 1524 1548 17977 8174  97755  93035
13 2 41289 13956 125758 1571 1729 18341 8395 100053  95487
14 2 41289 13956 125758 1616 1817 18802 8571 102514  97597
15 2 41289 13956 125758 1628 1907 19088 8709 104423  99286
16 2 41289 13956 125758 1732 2242 18860 8630 107322 101241
12 3  2366  4824  13563  231  122 12756 1229  12599  14602
13 3  2366  4824  13563  209  142 12765 1174  12594  14482
14 3  2366  4824  13563  206  132 12689 1091  12533  14354
15 3  2366  4824  13563  202  118 12529 1044  12341  14148
16 3  2366  4824  13563  209  117 12337  966  12186  13779
12 4  1885  3040  12705  112   31  4994  418  10402  12195
13 4  1885  3040  12705  119   35  4956  421  10354  12158
14 4  1885  3040  12705  124   50  4931  424  10338  12168
15 4  1885  3040  12705  127   52  4980  492  10413  12170
16 4  1885  3040  12705  113   45  4895  521  10498  12145
end
label values n_county n_county
label def n_county 1 "ALABAMA Autauga", modify
label def n_county 2 "ALABAMA Baldwin", modify
label def n_county 3 "ALABAMA Barbour", modify
label def n_county 4 "ALABAMA Bibb", modify
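
One way to read the request, sketched under assumptions: because the three education variables are counts rather than categories, a single edu_all variable only makes sense once the data are in long form, with one row per county, year and education level. The block below uses a few rows copied from the example above; the stub name edu_n and the value labels are hypothetical choices.

```stata
* A few rows copied from the dataex example above
clear
input int year float(n_county edu_college edu_hs edu_hsplus)
12 1  8721  4528  31629
13 1  8721  4528  31629
12 2 41289 13956 125758
13 2 41289 13956 125758
end

* Give the three counts a common stub (edu_n is a hypothetical name)
rename (edu_hs edu_hsplus edu_college) (edu_n1 edu_n2 edu_n3)

* Long form: edu_all = 1, 2, 3 identifies the level, edu_n holds the count
reshape long edu_n, i(n_county year) j(edu_all)

* Labels are assumptions about what each level means
label define edu_all 1 "High school" 2 "High school plus" 3 "College"
label values edu_all edu_all
list, sepby(n_county year)
```

The same rename and reshape pattern would apply to the gender, race and age-group counts, one stub at a time.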

Can you use a principal component analysis and a heckman test together?

For my master's thesis I am facing some issues and I would appreciate some advice. I use Stata 16 for Mac. I need to use principal component analysis to combine four components into one: colonial ties (binary dummy), shared monetary union (binary dummy), government effectiveness (continuous), and voice and accountability (continuous). I also need to do a Heckman test to control for selection bias. I am not sure whether you can first run a PCA and then a Heckman test. Is that possible? Which one should be done first? I cannot find this anywhere and I truly hope someone can help me online. I do not have the database ready because I first want to check whether this is possible. Any help is very much appreciated.

boottest: svmat(numer) option with ivreg2 or ivregress

Dear all,
I am trying to use the boottest command after estimating an IV model and to request that the bootstrapped test numerators be saved in return value r(dist). I receive the following error, r(3301):

"boottestModel::makeDistCDR(): 3301 subscript invalid
boottestModel::getdist(): - function returned error
boottest_stata(): - function returned error
<istmt>: - function returned error"

The same message appears in the following data example:
sysuse auto, clear
ivreg2 price i.rep78 (foreign = weight turn trunk)
boottest foreign, svmat(numer)

However, when I use an OLS model, no error occurs:
reg price i.rep78 foreign
boottest foreign, svmat(numer)

Thank you in advance.

Group by/ Aggregating Functions

I have three variables (var1, var2 and var3).
var1 is a number (e.g., values from 1 up to 6000000).
var2 is a binary variable (options: A and B).
var3 is also a binary variable (options: Y and Z).

My question concerns the creation of a new variable, let's say varnew, with the following conditions:
FOR EACH var1 value, IF ((var2=A) AND (var3=Y)) AND IF ((var2=B) AND (var3=Y)), return varnew=1
FOR EACH var1 value, IF ((var2=A) AND (var3=Z)) AND IF ((var2=B) AND (var3=Y)), return varnew=2
FOR EACH var1 value, IF ((var2=A) AND (var3=Y)) AND IF ((var2=B) AND (var3=Z)), return varnew=3
FOR EACH var1 value, IF ((var2=A) AND (var3=Z)) AND IF ((var2=B) AND (var3=Z)), return varnew=4

I have difficulty finding a function for the section in red, which would be a "group by"/"aggregating" function.
Thank you for your help.
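
A minimal sketch of the conditions above, assuming one row per (var1, var2) pair and that var2 and var3 are stored as strings. The "for each var1" part is handled with bysort and egen; the toy data and the helper names a_y, a_z, b_y and b_z are hypothetical.

```stata
* Toy data (assumption): each var1 value has one "A" row and one "B" row
clear
input long var1 str1 var2 str1 var3
1 "A" "Y"
1 "B" "Y"
2 "A" "Z"
2 "B" "Y"
3 "A" "Y"
3 "B" "Z"
4 "A" "Z"
4 "B" "Z"
end

* Within each var1 group, flag whether each (var2, var3) combination occurs
bysort var1: egen byte a_y = max(var2 == "A" & var3 == "Y")
bysort var1: egen byte a_z = max(var2 == "A" & var3 == "Z")
bysort var1: egen byte b_y = max(var2 == "B" & var3 == "Y")
bysort var1: egen byte b_z = max(var2 == "B" & var3 == "Z")

* Combine the group-level flags into the four cases listed above
gen byte varnew = .
replace varnew = 1 if a_y & b_y
replace varnew = 2 if a_z & b_y
replace varnew = 3 if a_y & b_z
replace varnew = 4 if a_z & b_z
list, sepby(var1)
```

If var2 and var3 are numeric with value labels instead, the string comparisons would need to be replaced by the underlying numeric codes.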

foreach loop with different prefixes

I need to do some data manipulation that involves variables with many different alpha prefixes (ranging from 2 to 4 characters each, depending on the prefix) and numeric suffixes. Example: abcvarname10, abcvarname11, abcvarname12, etc. The number of numeric suffixes is relatively small in comparison to the alpha prefixes. I don't mind re-running a foreach loop for each suffix, if needed, but it would be a pain to do for each prefix.

The bigger problem is I need to be able to reference the prefix, rather than do a "foreach v of varlist *varname10" so I can rely on another variable (abcdiffvar10, abcdiffvar11, abcdiffvar12, etc.) with the same prefix, for the if/then logic to create a new variable. I have created an example dataset with two prefixes (abc, xyz) and three suffixes (10, 11, and 12) to illustrate:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(abcvarname10 abcvarname11 abcvarname12 xyzvarname10 xyzvarname11 xyzvarname12 abcdiffvar10 abcdiffvar11 abcdiffvar12 xyzdiffvar10 xyzdiffvar11 xyzdiffvar12 group10 group11 group12)
      .9592   .7159957   -1.53703 -2.1440585   .3798885   .3884686 -1.0241606     1.495775 -1.5694687  .13208266  -.3498775   2.0339828 3 2 2
-.001805032 -1.5505602  -.5651004  -.8933501   .5635199  1.3552113 -2.8090355 -.0015595085    .602478   1.663722   .8201531    -.213778 2 1 2
  .54407704  -.8898921  2.8639376   .9374043  2.1282556   .3447447 -1.2186233   -.11580702   .7107502   .4416302  1.0002086    1.213635 3 3 3
 .001628714   .9244226  -.5919446  -2.518309 -1.4414136  1.1547347  -.8432435     .5440386    .048463 -1.1719221 -.27158034  -1.6120855 1 2 1
   .3576809   .8722835  -.2342041   .8660253   2.333187  .05336576 -1.3950206     2.406256 .028220505 -.07227423  1.4033577    .4856817 3 1 1
  1.8788676 -.21402645  .20872524 -1.6571213  -.8314571   1.057124   .8532264     .8427366 -1.4255425   .9563338   .9382064   -.3956575 2 3 1
   2.754746 -.35039315  -1.231335    .073458   .9399959  -1.637349  .26943946   -.22187367 -.26735693   .6043082  1.9303683    .5861435 2 1 2
  -.6125968  .12720117 -.37980115  .54026335  -.5378684   1.269977   .5818368 -.0022590598  -.9247593 -.09789592  2.0715008   -.6189131 3 2 3
  .19730793  .09283927   .5130483 -1.1783106   1.235614  -1.547639   2.812786     1.692513   .7419392  .08973765  .15257198   -.7029641 1 2 1
  1.6102238   -1.32811   .3004382  -.4270272 .031537138  1.1253278  -1.017413     .3740282  .07956094   .4631766 -.10889337  -.04049506 3 1 3
  -.8034225  1.0970963  -.7572337   .7986525   .3625358  .08678886 -.20324135     .7761782   .2289512   -2.55835   .7656382    .5525666 1 3 1
  1.0960116   .5551859 -.19588624   3.021777  -.3356278  -.9188327    .423871     1.563664  .01612376   .7539219  1.6262467   -.2576979 1 2 1
  -.4407027  -1.768393  -.4988511 -1.0827125   .3388321 -.12937512   1.485988     .4380077  -.3181014   .6150653   .9826437   .11858756 2 2 2
 -1.0114266   .8094901  .01018971 -.21997806  2.2095382 -1.3319527   .3227326   -.13763705   .4542596  -.4731691   .9599289   2.1930106 2 3 1
  1.0192274 -1.1189178  1.4619453 -.29597977  -.4924176   .2306863  -.2506901    -.2878682      .7594  -.0847074 -.47052205 -.015555364 2 2 1
  1.8719764 -.55332476 -.05717598  -.8350466  1.4447355   .1665975  -.8331214    1.0176448  .12591842  .18582727  1.0535157   -.6509834 1 3 2
   .4235664   .6854676   .8264247    2.09228  .25570896  -.3962188  -.4782639  -.016494058   2.589255  .28730237   .7075169   2.1423028 3 1 2
   .6339538  -.4006999   .7852141  -.5706332  -.3239344 -1.6179186  -.6076015   -1.2095773 -.03081547 -1.4314556  -1.219921   -1.379946 3 2 2
   .4172334   .7721941 -1.4413848 -1.8281108  1.4039243 .003396563  -.8239509   -.51378053  .17522463 -.28829035  -2.331224  -.52669317 1 2 3
 -.52793354   .5094895   1.395941  -.5447627  -.9250359   .9615512   1.777604    .25770998  1.1381559 -.19973695  -.1285255  -1.1253308 2 3 3
  .10534854  1.0146536 -.12238002 -1.3565013  1.0058329   .4528271  1.4848286     .7079188   .7016681  -.7850761  1.6907436    1.574243 3 1 1
   1.450646  -.3069668 -1.5097364  -.7983169  -.6368325  1.0169148   .8406451   -.45722595 -.14876075  1.0287405  1.4049963     -1.6987 2 3 1
 -1.0992253  -.4750325  1.1066369  -.1908189  .11252879 -1.2770315 -1.0325084     .9693564  .05969276  -.4023991  -2.128305  .064246655 2 2 3
  1.3672062  .14184116   1.210563  .51936007  -.5217331 -.51987046   .3538437   -.58625406 -2.1047397 -.39995325  -.8942189   1.4347937 2 2 2
   .5174628   1.272578   .2263926   .2797566  -.3942043   .3336025 -.20628548    -.9177772    .599732  -.6366209  1.3801566    1.388157 3 3 2
end



I would normally do something like:

Code:
foreach v of varlist *varname10 {
   gen `v'_GTE0 = `v'
   replace `v'_GTE0 = `v' if `prefix'diffvar10 >=0
   egen `v'_GTE0_groupmedian = median(`v'_GTE0), by(group10)
}
But, I don't know how to reference the 'prefix'diffvar (abcdiffvar10, xyzdiffvar10, etc.) in the code above. Is there a way to loop through all of the prefixes instead of doing it this way? Or a more efficient way that I am missing altogether?
I am using Stata/SE 16.1 on a Mac. Thanks in advance for any help/guidance you can offer.
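
One possible way to recover the prefix, as a sketch: inside the loop, strip the known stem from the variable name with subinstr() and store the remainder in a local macro. The block builds a small random stand-in dataset with the same naming pattern so that it runs on its own. It also assumes the intent is to keep values only where the matching diffvar is non-negative, which is a guess about the logic.

```stata
* Stand-in data with the same naming pattern as the example (random values)
clear
set seed 12345
set obs 20
foreach p in abc xyz {
    forvalues s = 10/12 {
        gen `p'varname`s' = rnormal()
        gen `p'diffvar`s' = rnormal()
    }
}
forvalues s = 10/12 {
    gen group`s' = runiformint(1, 3)
}

* Loop over suffixes, then over every variable ending in varname<suffix>
forvalues s = 10/12 {
    foreach v of varlist *varname`s' {
        * Whatever precedes "varname<suffix>" is the prefix (abc, xyz, ...)
        local prefix = subinstr("`v'", "varname`s'", "", 1)
        * Assumed intent: keep the value only where the matching diffvar is >= 0
        * (missing counts as +infinity in Stata, so it is excluded explicitly)
        gen `v'_GTE0 = `v' if `prefix'diffvar`s' >= 0 & !missing(`prefix'diffvar`s')
        egen `v'_GTE0_groupmedian = median(`v'_GTE0), by(group`s')
    }
}
```

Variable names are limited to 32 characters, so a 4-character prefix with the _GTE0_groupmedian ending still fits.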

Variance of a ratio with unknown covariance between numerator and denominator

I have two means and their standard errors from a paper. I would like to calculate their ratio and its variance, but I don't have the covariance. But if I could make some assumptions about the correlation and proceed from there by backing out the covariance, I can get a range of estimates. Theory suggests that the correlation is positive. I implemented this approach with the Delta Method and also Fieller's method on some fake data:

#delimit;
clear;
set obs 21;
egen rho = seq(), from(-10) to(10);
replace rho = rho/10;

scalar dR = 1026;
scalar dS = 305;
scalar var_dR = 2026^2;
scalar var_dS = 40^2;

gen cov = rho*sqrt(scalar(var_dR))*sqrt(scalar(var_dS));
gen roas = scalar(dR)/scalar(dS);

/* Delta Method */
gen var_roas = ((scalar(dR)^2)/(scalar(dS)^4))*scalar(var_dS)
+ (1/(scalar(dS)^2))*scalar(var_dR)
- 2*((scalar(dR))/(scalar(dS)^3))*cov;
gen roas_lb = roas - 1.96*sqrt(var_roas);
gen roas_ub = roas + 1.96*sqrt(var_roas);

/* Fieller's Method */
scalar t2 = invt(28,.95);
scalar aa = (scalar(dS)^2) - (scalar(var_dS)*scalar(t2)^2);
gen bb = (2*cov*scalar(t2)^2) - (2*scalar(dR)*scalar(dS));
scalar cc = (scalar(dR)^2) - (var_dR*scalar(t2)^2);
gen rad = sqrt(bb*bb - 4*aa*cc);
gen fi_lb = (-bb - rad) / (2 * aa);
gen fi_ub = (-bb + rad) / (2 * aa);

tw
(line roas rho, lcolor(navy))
(line roas_ub rho, lpattern(dash) lcolor(navy))
(line roas_lb rho, lpattern(dash) lcolor(navy))
(line fi_ub rho, lpattern(dash) lcolor(maroon))
(line fi_lb rho, lpattern(dash) lcolor(maroon))
, legend(label(1 "ROAS") label(2 "Delta Method CI") label(5 "Fieller's Method CI") order(1 2 5) rows(1) span)
xlab(#10, grid) ylab(#10, angle(0) grid)
yline(0, lpattern(solid) lcolor(gs5))
xtitle("Correlation Between dR and dS")
title("dR/dS Ratio With 95% Confidence Interval")
plotregion(fcolor(white) lcolor(white)) graphregion(fcolor(white) lcolor(white));
I am a little puzzled by why the intervals look so different, and I am uncertain if I specified t2 correctly (the means are based on 28 days of data).

Am I doing something stupid here? Is there any way to do something like this better (with tighter bounds)?
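
For reference, the two intervals computed in the code above can be written out as follows. This is a restatement of the posted code, with R and S standing for dR and dS, and with the covariance backed out from the assumed correlation; it is not a new method.

```latex
% Covariance implied by an assumed correlation \rho
\sigma_{RS} = \rho\,\sigma_R\,\sigma_S

% Delta method: first-order variance of the ratio R/S
\operatorname{Var}\!\left(\frac{R}{S}\right) \approx
  \frac{R^{2}}{S^{4}}\,\sigma_S^{2}
  + \frac{1}{S^{2}}\,\sigma_R^{2}
  - \frac{2R}{S^{3}}\,\sigma_{RS}

% Fieller: the bounds are the roots in \theta of a\theta^{2} + b\theta + c = 0
a = S^{2} - t^{2}\sigma_S^{2}, \qquad
b = 2t^{2}\sigma_{RS} - 2RS, \qquad
c = R^{2} - t^{2}\sigma_R^{2}

\theta_{\pm} = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
```

The quadratic follows from setting (R - θS)² equal to t² times the variance of R - θS. On the critical value: a two-sided 95% interval conventionally uses the 0.975 quantile, so invt(28, .95) corresponds to a two-sided 90% interval, while the delta-method bounds use 1.96. That difference alone would make the two sets of bounds not directly comparable.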

How can I make columns from a list in another variable

Hi,
I have some data in the layout below, from which I want to make a separate column for each name.
The desired output follows the data.

id name s
1 AMIKACIN S
1 AMOXICLLLIN R
1 AMPLICILLIN S
4 AVELOX R
1 AZTREONAM R
1 CEFIXIME R
1 CEFOPERAZONE/ SULBACTUM S
1 CEFTAZIDIME S
1 CEFTRIAXONE S
1 CIPROFLOXACIN S
1 MEROPENEM R
1 PIPERACILLIN/TAZOBACTAM R

Desired output:
id AMIKACIN AMIK_ss AMOXICLLLIN AMOX_ss AMPLICILLIN AMPLI_ss
1 Done S Done R Done S

How can I label the values of the bars within a twoway bar graph?

Hello!

I am creating a bar graph with confidence intervals (picture attached below). It would be nice if I could label the bars themselves so that they show the values they correspond to. Does anyone have any suggestions? This is the code I have used so far:

twoway (bar avg_vote_2010 gender,barwidth(0.8)) ///
(rcap low_val_2010 hi_val_2010 gender), ///
xlabel( 0 "Female" 1 "Male", noticks) ///
ytitle("Voting 2010 Average") by(actualtreatment)

My goal is to label the bars with the values of the variable "avg_vote_2010." This would be similar in effect to the "blabel" option for the "graph bar" command. I would greatly appreciate any suggestions!
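
A possible approach, sketched with made-up numbers: overlay a scatter plot whose marker is invisible and whose marker label is the value to display. Anchoring the label at the upper confidence limit keeps it clear of the cap. The toy values below are illustrative only; the variable names are the ones from the code above.

```stata
* Toy data (made-up values) with the same variable names as in the post
clear
input byte(actualtreatment gender) float(avg_vote_2010 low_val_2010 hi_val_2010)
0 0 .42 .38 .46
0 1 .45 .41 .49
1 0 .51 .47 .55
1 1 .48 .44 .52
end

* Third layer: invisible markers at the upper limit, labelled with the mean
twoway (bar avg_vote_2010 gender, barwidth(0.8)) ///
       (rcap low_val_2010 hi_val_2010 gender) ///
       (scatter hi_val_2010 gender, msymbol(none) mlabel(avg_vote_2010) ///
            mlabposition(12) mlabcolor(black)), ///
       xlabel(0 "Female" 1 "Male", noticks) ///
       ytitle("Voting 2010 Average") by(actualtreatment, legend(off))
```

If the stored means have many decimals, rounding them into a separate display variable before plotting would keep the labels short.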

[Figure not preserved]

Kind suggestion of a methodology for a sensitivity or robustness check of fixed effect

Could someone in the group please help me? I am doing research in health economics (child nutrition). I have estimated fixed-effects and random-effects models, and the Hausman test favours the fixed-effects model. The result makes sense because it confirms that there is an unobserved individual-specific effect that is correlated with the regressors. Now I want to do a sensitivity test. I have tried IV and control function analysis, but the results come out negative. Can someone suggest a methodology that I can use for a robustness check or sensitivity test?

Matching parents to children

Hi everyone! I am working with panel data in which I have both parents and their (adult) children in the Id column. What I would like to do is match those children with their parents, using the father and mother identification numbers (variables father and mother). Basically, in my example, I would like to have a line like:

Id (children) year sex birth partner FatherId Fatheryear Fathergender Fbirth Fpartner MotherId .... Mpartner

Where Id (children) are all the Ids for which I have information about both their parents (e.g. father != -5 & mother !=-5).

I think that maybe a loop might work, but I don't know how to approach it.

One major problem is that I don't have the same information for all my IDs: for example ID=1 has 7 years of information, ID=2 5 years and ID=3 6 years.

Is it possible to do something in this case? And if not, would it be possible if I had the same number of observations for each Id?


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(id year) byte gender int birth long(partner father mother)
1   1994 1 1950       2  -5  -5
1   1995 1 1950       2  -5  -5
1   1996 1 1950       2  -5  -5
1   1997 1 1950       2  -5  -5
1   1998 1 1950       2  -5  -5
1   1999 1 1950       2  -5  -5
1   2000 1 1950       2  -5  -5
2   1994 2 1951       1  -5  -5
2   1995 2 1951       1  -5  -5
2   1996 2 1951       1  -5  -5
2   1997 2 1951       1  -5  -5
2   1998 2 1951       1  -5  -5
3   2001 1 1983      -5   1   2
3   2002 1 1983      -5   1   2
3   2003 1 1983      -5   1   2
3   2004 1 1983      -5   1   2
3   2005 1 1983      -5   1   2
3   2006 1 1983      -5   1   2
end
Thank you very much
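
One possible approach, sketched under a strong assumption: because parents and children are observed in different years, the block attaches a single record per parent (the most recent year available) to every child-year row, using merge rather than a loop. Which parent year should be matched is a design decision the post leaves open. The data are a subset of the example above, and the f*/m* variable names are hypothetical.

```stata
* Subset of the dataex example above
clear
input long(id year) byte gender int birth long(partner father mother)
1 1999 1 1950  2 -5 -5
1 2000 1 1950  2 -5 -5
2 1997 2 1951  1 -5 -5
2 1998 2 1951  1 -5 -5
3 2001 1 1983 -5  1  2
3 2002 1 1983 -5  1  2
end

foreach p in father mother {
    local s = substr("`p'", 1, 1)          // "f" or "m", used as a name prefix
    preserve
    keep id year gender birth partner
    bysort id (year): keep if _n == _N     // assumption: use each person's last year
    rename (id year gender birth partner) (`p' `s'year `s'gender `s'birth `s'partner)
    tempfile parent
    save `parent', replace
    restore
    * Each child row picks up the record of the person whose id equals `p'
    merge m:1 `p' using `parent', keep(master match) nogenerate
}

* Keep only children for whom both parents are identified
keep if father != -5 & mother != -5
list
```

Unequal numbers of years per Id are not a problem here, since each parent contributes exactly one row to the lookup file.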






ROC/AUC and Cut off point

Dear all

I am trying to design a ROC curve and find the optimal cut off point.
My reference variable is "CardiacD" where 0 is "Heart disease" and 1 is "Valve Disease"
My class variable is "meanagreedvalue" which is a continuous variable.

I am trying to create a ROC CURVE and find the optimal cut off point for "meanagreedvalue" that will help differentiate between Heart disease and Valve disease.

The data is all normally distributed.

My code is :
rocreg CardiacD meanagreedvalue, probit

But when I plot this, I get a ROC curve that is the inverse of what I am expecting:




This does not quite make sense to me but I am not sure what mistake I have made.


For the optimal cutoff point I have used the cutpt command

cutpt CardiacD meanagreedvalue, noadjust


Many thanks in advance and greatly appreciated.
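
A hedged observation with a small simulation: a ROC curve falls below the diagonal when higher values of the classifier point to the category coded 0 in the reference variable. Swapping the reference coding mirrors the curve, and the area becomes one minus the original area. The simulated data and the variable heart are hypothetical; only the names CardiacD and meanagreedvalue come from the post.

```stata
* Simulated stand-in data: higher values are made more likely when CardiacD == 0
clear
set seed 2020
set obs 200
gen byte CardiacD = runiform() < 0.5
gen meanagreedvalue = rnormal(10 - 2*CardiacD, 2)

* Original coding: area under the curve below 0.5 (an "inverted" curve)
roctab CardiacD meanagreedvalue
scalar auc_original = r(area)

* Flipped coding: 1 = heart disease (hypothetical variable name)
gen byte heart = 1 - CardiacD
roctab heart meanagreedvalue, graph
scalar auc_flipped = r(area)

display "original AUC = " auc_original ", flipped AUC = " auc_flipped
```

Whether flipping the coding is appropriate depends on which condition is meant to be the "positive" one.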

Event study graph

Dear all,

I have this dataset

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float modate long gvkey float trn byte tic_newind float ff48_after
432 1043  .011599372 1 1
433 1043  .006463119 0 1
434 1043  .015808247 0 1
381 1056   .03750573 0 0
382 1056  .064970195 0 0
383 1056   .04966758 0 0
384 1056   .06059147 1 1
385 1056   .04347776 0 1
386 1056   .04417698 0 1
405 1056   .10177388 0 0
406 1056   .05430648 0 0
407 1056   .08125668 0 0
408 1056   .18746273 1 1
409 1056    .1538455 0 1
410 1056   .10503364 0 1
417 1056  .018058633 0 0
418 1056   .03559741 0 0
419 1056   .03681609 0 0
420 1056  .030866826 1 1
421 1056   .04697718 0 1
422 1056   .04775205 0 1
441 1056   .04478774 0 0
442 1056   .03789423 0 0
443 1056   .04094375 0 0
444 1056   .06175835 1 1
445 1056   .04464729 0 1
446 1056   .04800224 0 1
513 1056   .17015287 0 0
514 1056    .2357336 0 0
515 1056   .14718385 0 0
516 1056    .1545227 1 1
517 1056    .1759009 0 1
518 1056    .1343986 0 1
528 1082 .0031883584 1 1
529 1082  .000853499 0 1
530 1082  .002583388 0 1
405 1094  .016653419 0 0
406 1094   .04310413 0 0
407 1094   .11329491 0 0
408 1094   .04042383 1 1
409 1094   .05237905 0 1
410 1094   .08144742 0 1
393 1098   .04450803 0 0
394 1098   .07997139 0 0
395 1098   .03759339 0 0
396 1098   .03249086 1 1
397 1098   .04728978 0 1
398 1098   .04178986 0 1
405 1098    .0338261 0 0
406 1098   .05636624 0 0
407 1098    .0397711 0 0
408 1098   .07092672 1 1
409 1098   .03913527 0 1
410 1098   .05148625 0 1
393 1109   .04508612 0 0
394 1109   .03779129 0 0
395 1109   .09138805 0 0
396 1109   .09223603 1 1
397 1109   .13809524 0 1
398 1109    .2015528 0 1
429 1109       .2722 0 0
430 1109   .08326667 0 0
431 1109       .0976 0 0
432 1109    .4278667 1 1
433 1109        .208 0 1
434 1109   .09806667 0 1
417 1117   .04343989 0 0
418 1117    .0807021 0 0
419 1117   .04842126 0 0
420 1117   .05420671 1 1
421 1117   .04526378 0 1
422 1117  .067621104 0 1
369 1173  .016996872 0 0
370 1173  .018456725 0 0
371 1173   .07711157 0 0
372 1173  .027554745 1 1
373 1173   .08824296 0 1
374 1173   .03563608 0 1
381 1173   .09265567 0 0
382 1173   .07543907 0 0
383 1173   .03996807 0 0
384 1173  .028318584 1 1
385 1173   .03065165 0 1
386 1173   .02392062 0 1
477 1228   .08233494 0 0
478 1228    .7436393 0 0
479 1228   .16652174 0 0
480 1228   .10687242 1 1
481 1228    .2338371 0 1
482 1228    .3952275 0 1
561 1228   .05730356 0 0
562 1228   .15050155 0 0
563 1228   .07313765 0 0
564 1228    .2091151 1 1
565 1228   .13381559 0 1
566 1228   .17893605 0 1
453 1239   .04146822 0 0
454 1239  .024712674 0 0
455 1239  .027694177 0 0
456 1239  .035309017 1 1
end
format %tm modate
The variables are these:
modate = month and year
gvkey = firm identifier
trn = firm turnover
tic_newind = indicator equal to 1 in the month when the firm joins a new group
ff48_after = indicator equal to 1 in the 3 months from when the firm joins a new group, and 0 in the three months before the shift to a new group

Basically, I want to create an event study graph such as:
y axis = average turnover
x axis = -3 month, -2 month, -1 month, +1 month (this is the event), +2 month, +3 month

Since I have to work with averages, I would like to compute the average for each period (months before and after) and then plot the graph. So I created these variables:

Code:
egen trn_3before = mean(trn) if f3.tic_newind == 1
egen trn_2before = mean(trn) if f2.tic_newind == 1
egen trn_1before = mean(trn) if f1.tic_newind == 1

egen trn_1after = mean(trn) if tic_newind == 1
egen trn_2after = mean(trn) if l.tic_newind == 1
egen trn_3after = mean(trn) if l2.tic_newind == 1
Now I should plot for the event study with a line graph (this part is a bit puzzling).

I hope this is not confusing.
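
One way the plot could be built, as a sketch: rather than six separate mean variables, create a single event-time variable, collapse to one mean per event time, and draw a connected line. The rows below are copied from the example above (two firms); the names evtime and mean_trn are hypothetical, and the +1 label for the event month follows the axis described in the post.

```stata
* Rows copied from the dataex example above (two firms, one event each)
clear
input float modate long gvkey float trn byte tic_newind
381 1056 .03750573  0
382 1056 .064970195 0
383 1056 .04966758  0
384 1056 .06059147  1
385 1056 .04347776  0
386 1056 .04417698  0
393 1098 .04450803  0
394 1098 .07997139  0
395 1098 .03759339  0
396 1098 .03249086  1
397 1098 .04728978  0
398 1098 .04178986  0
end
format %tm modate
xtset gvkey modate

* Event time: -3..-1 before the event, +1 in the event month, +2 and +3 after
gen byte evtime = .
forvalues k = 1/3 {
    replace evtime = -`k' if F`k'.tic_newind == 1
}
replace evtime = 1 if tic_newind == 1
replace evtime = 2 if L1.tic_newind == 1
replace evtime = 3 if L2.tic_newind == 1

* One average per event time, then a connected line
collapse (mean) mean_trn = trn if !missing(evtime), by(evtime)
twoway connected mean_trn evtime, xline(0, lpattern(dash)) ///
    xlabel(-3 -2 -1 1 2 3) xtitle("Months relative to joining a new group") ///
    ytitle("Average turnover")
```

Because collapse replaces the data in memory, wrapping it in preserve and restore keeps the original panel available.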

Estimation of margins after estimates use: e(sample) does not identify the estimation sample

Hi list,

I am currently working on a project that involves using the mixed command. The model includes a lot of variables and takes several hours to run. Therefore, I thought it would be smart to save the estimates so that I don't have to run the model at the beginning of every session. I work on a protected server with no internet access and Stata version 16.1.

After having estimated my model I write
Code:
 estimates save fullmodel
I notice that a file called fullmodel.ster is created in my working folder.


In a new session, where I have loaded the exact same data that I used when estimating the model, I write
Code:
estimates use fullmodel
estimates esample
Stata tells me that e(sample) is not set (0 assumed).

When I attempt to calculate predicted values for different subgroups using margins, I obtain the following error:

"e(sample) does not identify the estimation sample r(322);"


In the documentation for [R] estimates save, I read that when I use estimates use, Stata assumes that none of the observations in the data was used in producing the estimates currently loaded.

It also says that "There are some postestimation statistics that are appropriate only when calculated on the estimation sample. Setting e(sample) to 0 ensures that if you ask for one of them, you will get back a null result"

Is this what I have encountered? Does margins need the sample to run?

The documentation suggests that you can define the sample as everyone who has non-missing values in the variables included in the model. However, I believe that mixed allows units to have some missing values in the included variables, and therefore this would not work in my case.

With basis in this link https://stats.idre.ucla.edu/stata/faq/how-can-i-identify-cases-used-by-an-estimation-command-using-esample/ I try the following after having estimated my model:

Code:
gen sample = e(sample) // generates variable=1 if unit was used to calculate model I believe?
save "data_post_estimation.dta", replace // saves data with the new variable
estimates save fullmodel // saves estimation

In another session where I want to work with the estimation results:

Code:
use "data_post_estimation.dta", clear
estimates use fullmodel
estimates esample: if sample // I believe this tells stata that the sample is the units with the value of 1 in this variable?
Stata then allows me to calculate margins. Is this a meaningful solution?
I am not certain I understand what happens. Is there a more correct way to go about this?

Best,
Mads

Lower number of observations after logit on mi data

Dear all,
I have a set of variables with missing data: a total of 6000 observations and 49 variables. I used mi chained equations to impute the data. When I use the sum command, all the imputed variables show 6000 observations. However, when I run the logit command with mi estimate, I get only 4742 observations, as if it used only the data from before imputation. The code I used is:
Code:
mi estimate, or: logit prediabeties age albumin alkalinephosphatase alt ast bicarbonate bilirubintotal ///
    calcium calciumcorrected chloride cholesteroltotal cpeptide creatinineumolml dihydroxyvitamind ///
    estradiol ferritin folate freethyroxine freetriiodothyronine hdlcholesterol homocysteine insulin ///
    iron ldlcholesterol magnesium phosphorus potassium sodium testosteronetotal tsh tibc totalprotein ///
    triglyceride urea uricacid vitaminb12 waistsize hipssize waisthipratio sbp dbp pulse bmi
My dependent variable is prediabetes and all the predictors are continuous.

I get this:

[Output screenshot not preserved]


Stata IC 15.1, Mac