Channel: Statalist

System GMM - to address reverse causality

Hi everyone,

I am looking to examine the relationship between perceived neighbourhood cohesion (NSC_index) and life satisfaction (lfsato). In my paper I have run an OLS model (as a benchmark) and an FE model, and now I want to run a dynamic panel model using system GMM.

I have three key questions regarding system GMM, which I will outline below; I would greatly appreciate any guidance.
I am running the system GMM model (using Stata 18) as follows:

Code:
xtset pidp wave

xtabond2 lfsato lag_lfsato NSC_index income i.age_group_destr age2 jbstat_simple edu_simple marriage_status tenure_dummy addrmov_dummy aidhh_dummy hhsize_simple nchild_simple physical_health mental_health i.wave i.gor_dv [pweight=l_indscus_lw], gmm (lag_lfsato income marriage_status physical_health mental_health, collapse) iv( NSC_index i.age_group_destr age2 jbstat_simple edu_simple tenure_dummy addrmov_dummy aidhh_dummy hhsize_simple nchild_simple i.wave i.gor_dv) nodiffsargan robust small
In terms of the endogenous variables specified by gmm(): I have included lagged life satisfaction, and from the literature Piper (2023) states that marriage status, income and health are endogenous with life satisfaction, so I have included these as well. I also computed a pairwise correlation matrix and VIFs across all of my explanatory variables and life satisfaction, and found that mental health was also quite highly correlated with life satisfaction, so I have included that variable too.

I have then included all of the other explanatory variables from my OLS and FE regressions as exogenous instruments, specified by iv().

Q1. Is this the correct/valid way to decide which variables are endogenous/exogenous?

Running the above code in Stata generates the following output:

Code:
. xtabond2 lfsato laglfsato3 NSC_index fihhmngrs1_dv i.age_group_destr age2 jbstat_simple edu_si
> mple mastat_simple tenure_dummy addrmov_dummy aidhh_dummy hhsize_simple nchild_simple scsf1_co
> mbined_r sf12mcs_dv i.wave i.gor_dv [pweight=l_indscus_lw], gmm (laglfsato3 fihhmngrs1_dv mast
> at_simple scsf1_combined_r sf12mcs_dv, collapse) iv( NSC_index i.age_group_destr age2 jbstat_s
> imple edu_simple tenure_dummy addrmov_dummy aidhh_dummy hhsize_simple nchild_simple i.wave i.g
> or_dv) nodiffsargan robust small     
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
1b.age_group_destr dropped due to collinearity
7.age_group_destr dropped due to collinearity
1b.wave dropped due to collinearity
3.wave dropped due to collinearity
1b.gor_dv dropped due to collinearity
(sum of weights is 22647.4695)
Warning: Two-step estimated covariance matrix of moments is singular.
  Using a generalized inverse to calculate robust weighting matrix for Hansen test.

Dynamic panel-data estimation, one-step system GMM
------------------------------------------------------------------------------
Group variable: pidp                            Number of obs      =     23378
Time variable : wave                            Number of groups   =      6334
Number of instruments = 53                      Obs per group: min =         1
F(., 6333)    =         .                                      avg =      3.69
Prob > F      =         .                                      max =         4
-----------------------------------------------------------------------------------------------
                              |               Robust
                       lfsato | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
------------------------------+----------------------------------------------------------------
                   laglfsato3 |    .083551   .0159013     5.25   0.000     .0523791    .1147229
                   NSC_index_ |   .1831389   .0231104     7.92   0.000     .1378347    .2284431
                fihhmngrs1_dv |   7.19e-06   5.31e-06     1.35   0.176    -3.22e-06    .0000176
                              |
              age_group_destr |
                       18-24  |   .1436355   .1315885     1.09   0.275    -.1143224    .4015935
                       25-34  |  -.0245395   .1061021    -0.23   0.817    -.2325355    .1834565
                       35-44  |  -.1553759   .0900446    -1.73   0.084    -.3318939     .021142
                       45-54  |  -.2298868   .0702192    -3.27   0.001    -.3675402   -.0922334
                       55-64  |  -.1802464   .0472028    -3.82   0.000    -.2727798    -.087713
                              |
                         age2 |   .0000259   .0000257     1.01   0.314    -.0000245    .0000763
                jbstat_simple |  -.0034113   .0083322    -0.41   0.682    -.0197452    .0129225
                   edu_simple |  -.0472176    .012247    -3.86   0.000    -.0712258   -.0232093
                mastat_simple |  -.0096751   .0375167    -0.26   0.797    -.0832206    .0638704
                 tenure_dummy |   .0607223    .028808     2.11   0.035     .0042489    .1171958
                addrmov_dummy |   .1015179   .0492164     2.06   0.039     .0050371    .1979986
                  aidhh_dummy |  -.1121506   .0495233    -2.26   0.024     -.209233   -.0150682
                hhsize_simple |   -.034727     .02944    -1.18   0.238    -.0924393    .0229853
                nchild_simple |    .017693   .0200045     0.88   0.376    -.0215226    .0569086
             scsf1_combined_r |   .1988844   .0238085     8.35   0.000     .1522117    .2455571
                   sf12mcs_dv |   .0488631   .0020779    23.52   0.000     .0447897    .0529364
                              |
                         wave |
                           2  |  -.0387835   .0257759    -1.50   0.132     -.089313    .0117459
                           4  |   .0262173   .0260069     1.01   0.313    -.0247652    .0771997
                           5  |   .1596027   .0265361     6.01   0.000      .107583    .2116224
                              |
                       gor_dv |
             north west       |   -.011139   .0671737    -0.17   0.868    -.1428223    .1205443
yorkshire and the humber  ..  |   .0114451   .0709075     0.16   0.872    -.1275576    .1504478
             east midlands    |   .0307968   .0670911     0.46   0.646    -.1007246    .1623181
             west midlands    |   .0510337   .0690321     0.74   0.460    -.0842926      .18636
             east of england  |   .0066081   .0640761     0.10   0.918    -.1190027     .132219
                     london   |  -.1073267   .0741767    -1.45   0.148    -.2527381    .0380847
             south east       |   .0033076   .0631424     0.05   0.958     -.120473    .1270882
             south west       |   .0132228   .0638898     0.21   0.836    -.1120228    .1384685
                     wales    |   .0324638   .0703395     0.46   0.644    -.1054255    .1703531
             scotland         |  -.0823215   .0720415    -1.14   0.253    -.2235473    .0589042
     northern ireland         |   .0762113   .0860584     0.89   0.376    -.0924923    .2449148
                              |
                        _cons |   1.236654   .2329817     5.31   0.000     .7799315    1.693377
-----------------------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(NSC_index_ 1b.age_group_destr 2.age_group_destr 3.age_group_destr
    4.age_group_destr 5.age_group_destr 6.age_group_destr 7.age_group_destr
    age2 jbstat_simple edu_simple tenure_dummy addrmov_dummy aidhh_dummy
    hhsize_simple nchild_simple 1b.wave 2.wave 3.wave 4.wave 5.wave 1b.gor_dv
    2.gor_dv 3.gor_dv 4.gor_dv 5.gor_dv 6.gor_dv 7.gor_dv 8.gor_dv 9.gor_dv
    10.gor_dv 11.gor_dv 12.gor_dv)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(1/4).(laglfsato3 fihhmngrs1_dv mastat_simple scsf1_combined_r
    sf12mcs_dv) collapsed
Instruments for levels equation
  Standard
    NSC_index_ 1b.age_group_destr 2.age_group_destr 3.age_group_destr
    4.age_group_destr 5.age_group_destr 6.age_group_destr 7.age_group_destr
    age2 jbstat_simple edu_simple tenure_dummy addrmov_dummy aidhh_dummy
    hhsize_simple nchild_simple 1b.wave 2.wave 3.wave 4.wave 5.wave 1b.gor_dv
    2.gor_dv 3.gor_dv 4.gor_dv 5.gor_dv 6.gor_dv 7.gor_dv 8.gor_dv 9.gor_dv
    10.gor_dv 11.gor_dv 12.gor_dv
    _cons
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    D.(laglfsato3 fihhmngrs1_dv mastat_simple scsf1_combined_r sf12mcs_dv)
    collapsed
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -24.57  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =   0.69  Pr > z =  0.490
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(19)   =  41.38  Prob > chi2 =  0.002
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(19)   =  24.70  Prob > chi2 =  0.171
  (Robust, but weakened by many instruments.)
Q2. Does this seem correctly specified?

Q3. I’m a bit concerned that the Sargan test is still significant. Should I try to reduce the number of exogenous instruments or lags in my model? Or is there an alternative way to address this issue?
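One hedged option regarding Q3 (a sketch only, not a recommendation): xtabond2's gmm() option accepts a lag() suboption, so restricting the GMM-style instruments to, say, lags 1-2 reduces the instrument count alongside collapse:

Code:
* hypothetical variant: limit the GMM-style instruments to lags 1-2
gmm(lag_lfsato income marriage_status physical_health mental_health, collapse lag(1 2))

Comparing the Hansen/Sargan results across a few lag limits would show how sensitive the tests are to the instrument count.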

Thank you in advance for any advice or guidance you may be able to provide. I am very new to statistics and have spent a lot of time reading the Stata documentation and the empirical literature on how best to use GMM, but would love some clarification on the above.

Paper cited: Piper, Alan. (2023). What Does Dynamic Panel Analysis Tell Us About Life Satisfaction?. Review of Income and Wealth. 10.1111/roiw.12567.

Many thanks,
Emma

Replace specific values in all rows corresponding to each ID

Hi there - I have a data set with multiple rows per ID and a column with a variable (var1) that varies from row to row. I would like all rows of var1 to be replaced with "PTSD Present" for every ID that has at least one observation of var1 equal to "PTSD Present"; otherwise, the rows should stay as they are. Below is an example of the data. For IDs 1, 2, and 4, I would like all rows to take the value "PTSD Present" because there is at least one such value for the ID. For ID 3, both rows are the same, so this should stay as is. I am using Stata/SE 17.0.

eoc_id is the ID variable
icd_ptsd_present is a float variable (0=PTSD Absent; 1=PTSD Present)
eoc_id icd_ptsd_present
1 PTSD Absent
1 PTSD Present
1 PTSD Present
1 PTSD Absent
2 PTSD Absent
2 PTSD Absent
2 PTSD Present
3 PTSD Present
3 PTSD Present
4 PTSD Absent
4 PTSD Present
I have tried the bysort and replace commands below; however, upon manual inspection this did not work appropriately within IDs. For example, within some IDs, all rows for this variable were replaced with "PTSD Absent".

Code:
bysort eoc_id (icd_ptsd_present): replace icd_ptsd_present = icd_ptsd_present[1]
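A sketch of one possible fix, assuming icd_ptsd_present is coded 0/1 with the value labels shown: sorting ascending makes icd_ptsd_present[1] the within-ID minimum, so any "PTSD Absent" (0) spreads; taking the within-ID maximum instead propagates "PTSD Present" (1):

Code:
* within each ID, take the maximum of the 0/1 indicator: any row coded
* 1 ("PTSD Present") then propagates to every row of that ID
bysort eoc_id: egen byte any_ptsd = max(icd_ptsd_present)
replace icd_ptsd_present = any_ptsd
drop any_ptsd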

Thank you for your help!

Generating Monthly Variable from Daily Rainfall Data

I have daily rainfall data from Indian Meteorological department from 2000 to 2015. The data reports the day of the year (numbered from 1 to 365), latitude, longitude and year (2000 to 2020). The data does not report month of the year. Is there a way to use the day of the year information to generate a variable for month?
Below is an example of the data


Code:
* Example generated by -dataex-. For    more info,    type    help    dataex
clear
input int day float(lat lon rainfall    Year)
1  6.5  66.5 -999 2015
1  6.5 66.75 -999 2015
1  6.5    67 -999 2015
1  6.5 67.25 -999 2015
1  6.5  67.5 -999 2015
1  6.5 67.75 -999 2015
1  6.5    68 -999 2015
1  6.5 68.25 -999 2015
1  6.5  68.5 -999 2015
1  6.5 68.75 -999 2015
1  6.5    69 -999 2015
1  6.5 69.25 -999 2015
1  6.5  69.5 -999 2015
1  6.5 69.75 -999 2015
1  6.5    70 -999 2015
1  6.5 70.25 -999 2015
1  6.5  70.5 -999 2015
1  6.5 70.75 -999 2015
1  6.5    71 -999 2015
1  6.5 71.25 -999 2015
1  6.5  71.5 -999 2015
1  6.5 71.75 -999 2015
1  6.5    72 -999 2015
1  6.5 72.25 -999 2015
1  6.5  72.5 -999 2015
1  6.5 72.75 -999 2015
1  6.5    73 -999 2015
1  6.5 73.25 -999 2015
1  6.5  73.5 -999 2015
1  6.5 73.75 -999 2015
1  6.5    74 -999 2015
1  6.5 74.25 -999 2015
1  6.5  74.5 -999 2015
1  6.5 74.75 -999 2015
1  6.5    75 -999 2015
1  6.5 75.25 -999 2015
1  6.5  75.5 -999 2015
1  6.5 75.75 -999 2015
1  6.5    76 -999 2015
1  6.5 76.25 -999 2015
1  6.5  76.5 -999 2015
1  6.5 76.75 -999 2015
1  6.5    77 -999 2015
1  6.5 77.25 -999 2015
1  6.5  77.5 -999 2015
1  6.5 77.75 -999 2015
1  6.5    78 -999 2015
1  6.5 78.25 -999 2015
1  6.5  78.5 -999 2015
1  6.5 78.75 -999 2015
1  6.5    79 -999 2015
1  6.5 79.25 -999 2015
1  6.5  79.5 -999 2015
1  6.5 79.75 -999 2015
1  6.5    80 -999 2015
1  6.5 80.25 -999 2015
1  6.5  80.5 -999 2015
1  6.5 80.75 -999 2015
1  6.5    81 -999 2015
1  6.5 81.25 -999 2015
1  6.5  81.5 -999 2015
1  6.5 81.75 -999 2015
1  6.5    82 -999 2015
1  6.5 82.25 -999 2015
1  6.5  82.5 -999 2015
1  6.5 82.75 -999 2015
1  6.5    83 -999 2015
1  6.5 83.25 -999 2015
1  6.5  83.5 -999 2015
1  6.5 83.75 -999 2015
1  6.5    84 -999 2015
1  6.5 84.25 -999 2015
1  6.5  84.5 -999 2015
1  6.5 84.75 -999 2015
1  6.5    85 -999 2015
1  6.5 85.25 -999 2015
1  6.5  85.5 -999 2015
1  6.5 85.75 -999 2015
1  6.5    86 -999 2015
1  6.5 86.25 -999 2015
1  6.5  86.5 -999 2015
1  6.5 86.75 -999 2015
1  6.5    87 -999 2015
1  6.5 87.25 -999 2015
1  6.5  87.5 -999 2015
1  6.5 87.75 -999 2015
1  6.5    88 -999 2015
1  6.5 88.25 -999 2015
1  6.5  88.5 -999 2015
1  6.5 88.75 -999 2015
1  6.5    89 -999 2015
1  6.5 89.25 -999 2015
1  6.5  89.5 -999 2015
1  6.5 89.75 -999 2015
1  6.5    90 -999 2015
1  6.5 90.25 -999 2015
1  6.5  90.5 -999 2015
1  6.5 90.75 -999 2015
1  6.5    91 -999 2015
1  6.5 91.25 -999 2015
1  6.5  91.5 -999 2015
1  6.5 91.75 -999 2015
1  6.5    92 -999 2015
1  6.5 92.25 -999 2015
1  6.5  92.5 -999 2015
1  6.5 92.75 -999 2015
1  6.5    93 -999 2015
1  6.5 93.25 -999 2015
1  6.5  93.5 -999 2015
1  6.5 93.75 -999 2015
1  6.5    94 -999 2015
1  6.5 94.25 -999 2015
1  6.5  94.5 -999 2015
1  6.5 94.75 -999 2015
1  6.5    95 -999 2015
1  6.5 95.25 -999 2015
1  6.5  95.5 -999 2015
1  6.5 95.75 -999 2015
1  6.5    96 -999 2015
1  6.5 96.25 -999 2015
1  6.5  96.5 -999 2015
1  6.5 96.75 -999 2015
1  6.5    97 -999 2015
1  6.5 97.25 -999 2015
1  6.5  97.5 -999 2015
1  6.5 97.75 -999 2015
1  6.5    98 -999 2015
1  6.5 98.25 -999 2015
1  6.5  98.5 -999 2015
1  6.5 98.75 -999 2015
1  6.5    99 -999 2015
1  6.5 99.25 -999 2015
1  6.5  99.5 -999 2015
1  6.5 99.75 -999 2015
1  6.5   100 -999 2015
1 6.75  66.5 -999 2015
1 6.75 66.75 -999 2015
1 6.75    67 -999 2015
1 6.75 67.25 -999 2015
1 6.75  67.5 -999 2015
1 6.75 67.75 -999 2015
1 6.75    68 -999 2015
1 6.75 68.25 -999 2015
1 6.75  68.5 -999 2015
1 6.75 68.75 -999 2015
1 6.75    69 -999 2015
1 6.75 69.25 -999 2015
1 6.75  69.5 -999 2015
1 6.75 69.75 -999 2015
1 6.75    70 -999 2015
1 6.75 70.25 -999 2015
1 6.75  70.5 -999 2015
1 6.75 70.75 -999 2015
1 6.75    71 -999 2015
1 6.75 71.25 -999 2015
1 6.75  71.5 -999 2015
1 6.75 71.75 -999 2015
1 6.75    72 -999 2015
1 6.75 72.25 -999 2015
1 6.75  72.5 -999 2015
1 6.75 72.75 -999 2015
1 6.75    73 -999 2015
1 6.75 73.25 -999 2015
1 6.75  73.5 -999 2015
1 6.75 73.75 -999 2015
1 6.75    74 -999 2015
1 6.75 74.25 -999 2015
1 6.75  74.5 -999 2015
1 6.75 74.75 -999 2015
1 6.75    75 -999 2015
1 6.75 75.25 -999 2015
1 6.75  75.5 -999 2015
1 6.75 75.75 -999 2015
1 6.75    76 -999 2015
1 6.75 76.25 -999 2015
1 6.75  76.5 -999 2015
1 6.75 76.75 -999 2015
1 6.75    77 -999 2015
1 6.75 77.25 -999 2015
1 6.75  77.5 -999 2015
1 6.75 77.75 -999 2015
1 6.75    78 -999 2015
1 6.75 78.25 -999 2015
1 6.75  78.5 -999 2015
1 6.75 78.75 -999 2015
1 6.75    79 -999 2015
1 6.75 79.25 -999 2015
1 6.75  79.5 -999 2015
1 6.75 79.75 -999 2015
1 6.75    80 -999 2015
1 6.75 80.25 -999 2015
1 6.75  80.5 -999 2015
1 6.75 80.75 -999 2015
1 6.75    81 -999 2015
1 6.75 81.25 -999 2015
1 6.75  81.5 -999 2015
1 6.75 81.75 -999 2015
1 6.75    82 -999 2015
1 6.75 82.25 -999 2015
1 6.75  82.5 -999 2015
1 6.75 82.75 -999 2015
1 6.75    83 -999 2015
1 6.75 83.25 -999 2015
1 6.75  83.5 -999 2015
1 6.75 83.75 -999 2015
1 6.75    84 -999 2015
1 6.75 84.25 -999 2015
1 6.75  84.5 -999 2015
1 6.75 84.75 -999 2015
1 6.75    85 -999 2015
1 6.75 85.25 -999 2015
1 6.75  85.5 -999 2015
1 6.75 85.75 -999 2015
1 6.75    86 -999 2015
1 6.75 86.25 -999 2015
1 6.75  86.5 -999 2015
1 6.75 86.75 -999 2015
1 6.75    87 -999 2015
1 6.75 87.25 -999 2015
1 6.75  87.5 -999 2015
1 6.75 87.75 -999 2015
1 6.75    88 -999 2015
1 6.75 88.25 -999 2015
1 6.75  88.5 -999 2015
1 6.75 88.75 -999 2015
1 6.75    89 -999 2015
1 6.75 89.25 -999 2015
1 6.75  89.5 -999 2015
1 6.75 89.75 -999 2015
1 6.75    90 -999 2015
1 6.75 90.25 -999 2015
1 6.75  90.5 -999 2015
1 6.75 90.75 -999 2015
1 6.75    91 -999 2015
1 6.75 91.25 -999 2015
1 6.75  91.5 -999 2015
1 6.75 91.75 -999 2015
1 6.75    92 -999 2015
1 6.75 92.25 -999 2015
1 6.75  92.5 -999 2015
1 6.75 92.75 -999 2015
1 6.75    93 -999 2015
1 6.75 93.25 -999 2015
1 6.75  93.5 -999 2015
1 6.75 93.75 -999 2015
1 6.75    94 -999 2015
1 6.75 94.25 -999 2015
1 6.75  94.5 -999 2015
1 6.75 94.75 -999 2015
1 6.75    95 -999 2015
1 6.75 95.25 -999 2015
1 6.75  95.5 -999 2015
1 6.75 95.75 -999 2015
1 6.75    96 -999 2015
1 6.75 96.25 -999 2015
1 6.75  96.5 -999 2015
1 6.75 96.75 -999 2015
1 6.75    97 -999 2015
1 6.75 97.25 -999 2015
1 6.75  97.5 -999 2015
1 6.75 97.75 -999 2015
1 6.75    98 -999 2015
1 6.75 98.25 -999 2015
1 6.75  98.5 -999 2015
1 6.75 98.75 -999 2015
1 6.75    99 -999 2015
1 6.75 99.25 -999 2015
1 6.75  99.5 -999 2015
1 6.75 99.75 -999 2015
1 6.75   100 -999 2015
1    7  66.5 -999 2015
1    7 66.75 -999 2015
1    7    67 -999 2015
1    7 67.25 -999 2015
1    7  67.5 -999 2015
1    7 67.75 -999 2015
1    7    68 -999 2015
1    7 68.25 -999 2015
1    7  68.5 -999 2015
1    7 68.75 -999 2015
1    7    69 -999 2015
1    7 69.25 -999 2015
1    7  69.5 -999 2015
1    7 69.75 -999 2015
1    7    70 -999 2015
1    7 70.25 -999 2015
1    7  70.5 -999 2015
1    7 70.75 -999 2015
1    7    71 -999 2015
1    7 71.25 -999 2015
1    7  71.5 -999 2015
1    7 71.75 -999 2015
1    7    72 -999 2015
1    7 72.25 -999 2015
1    7  72.5 -999 2015
1    7 72.75 -999 2015
1    7    73 -999 2015
1    7 73.25 -999 2015
1    7  73.5 -999 2015
1    7 73.75 -999 2015
1    7    74 -999 2015
1    7 74.25 -999 2015
1    7  74.5 -999 2015
1    7 74.75 -999 2015
1    7    75 -999 2015
1    7 75.25 -999 2015
1    7  75.5 -999 2015
1    7 75.75 -999 2015
1    7    76 -999 2015
1    7 76.25 -999 2015
1    7  76.5 -999 2015
1    7 76.75 -999 2015
1    7    77 -999 2015
1    7 77.25 -999 2015
1    7  77.5 -999 2015
1    7 77.75 -999 2015
1    7    78 -999 2015
1    7 78.25 -999 2015
1    7  78.5 -999 2015
1    7 78.75 -999 2015
1    7    79 -999 2015
1    7 79.25 -999 2015
1    7  79.5 -999 2015
1    7 79.75 -999 2015
1    7    80 -999 2015
1    7 80.25 -999 2015
1    7  80.5 -999 2015
1    7 80.75 -999 2015
1    7    81 -999 2015
1    7 81.25 -999 2015
1    7  81.5 -999 2015
1    7 81.75 -999 2015
1    7    82 -999 2015
1    7 82.25 -999 2015
1    7  82.5 -999 2015
1    7 82.75 -999 2015
1    7    83 -999 2015
1    7 83.25 -999 2015
1    7  83.5 -999 2015
1    7 83.75 -999 2015
1    7    84 -999 2015
1    7 84.25 -999 2015
1    7  84.5 -999 2015
1    7 84.75 -999 2015
1    7    85 -999 2015
1    7 85.25 -999 2015
1    7  85.5 -999 2015
1    7 85.75 -999 2015
1    7    86 -999 2015
1    7 86.25 -999 2015
1    7  86.5 -999 2015
1    7 86.75 -999 2015
1    7    87 -999 2015
1    7 87.25 -999 2015
1    7  87.5 -999 2015
1    7 87.75 -999 2015
1    7    88 -999 2015
1    7 88.25 -999 2015
1    7  88.5 -999 2015
1    7 88.75 -999 2015
1    7    89 -999 2015
1    7 89.25 -999 2015
1    7  89.5 -999 2015
1    7 89.75 -999 2015
1    7    90 -999 2015
1    7 90.25 -999 2015
1    7  90.5 -999 2015
1    7 90.75 -999 2015
1    7    91 -999 2015
1    7 91.25 -999 2015
1    7  91.5 -999 2015
1    7 91.75 -999 2015
1    7    92 -999 2015
1    7 92.25 -999 2015
1    7  92.5 -999 2015
1    7 92.75 -999 2015
1    7    93 -999 2015
1    7 93.25 -999 2015
1    7  93.5 -999 2015
1    7 93.75 -999 2015
1    7    94 -999 2015
1    7 94.25 -999 2015
1    7  94.5 -999 2015
1    7 94.75 -999 2015
1    7    95 -999 2015
1    7 95.25 -999 2015
1    7  95.5 -999 2015
1    7 95.75 -999 2015
1    7    96 -999 2015
1    7 96.25 -999 2015
1    7  96.5 -999 2015
1    7 96.75 -999 2015
1    7    97 -999 2015
1    7 97.25 -999 2015
1    7  97.5 -999 2015
1    7 97.75 -999 2015
1    7    98 -999 2015
1    7 98.25 -999 2015
1    7  98.5 -999 2015
1    7 98.75 -999 2015
1    7    99 -999 2015
1    7 99.25 -999 2015
1    7  99.5 -999 2015
1    7 99.75 -999 2015
1    7   100 -999 2015
1 7.25  66.5 -999 2015
1 7.25 66.75 -999 2015
1 7.25    67 -999 2015
1 7.25 67.25 -999 2015
1 7.25  67.5 -999 2015
1 7.25 67.75 -999 2015
1 7.25    68 -999 2015
1 7.25 68.25 -999 2015
1 7.25  68.5 -999 2015
1 7.25 68.75 -999 2015
1 7.25    69 -999 2015
1 7.25 69.25 -999 2015
1 7.25  69.5 -999 2015
1 7.25 69.75 -999 2015
1 7.25    70 -999 2015
1 7.25 70.25 -999 2015
1 7.25  70.5 -999 2015
1 7.25 70.75 -999 2015
1 7.25    71 -999 2015
1 7.25 71.25 -999 2015
1 7.25  71.5 -999 2015
1 7.25 71.75 -999 2015
1 7.25    72 -999 2015
1 7.25 72.25 -999 2015
1 7.25  72.5 -999 2015
1 7.25 72.75 -999 2015
1 7.25    73 -999 2015
1 7.25 73.25 -999 2015
1 7.25  73.5 -999 2015
1 7.25 73.75 -999 2015
1 7.25    74 -999 2015
1 7.25 74.25 -999 2015
1 7.25  74.5 -999 2015
1 7.25 74.75 -999 2015
1 7.25    75 -999 2015
1 7.25 75.25 -999 2015
1 7.25  75.5 -999 2015
1 7.25 75.75 -999 2015
1 7.25    76 -999 2015
1 7.25 76.25 -999 2015
1 7.25  76.5 -999 2015
1 7.25 76.75 -999 2015
1 7.25    77 -999 2015
1 7.25 77.25 -999 2015
1 7.25  77.5 -999 2015
1 7.25 77.75 -999 2015
1 7.25    78 -999 2015
1 7.25 78.25 -999 2015
1 7.25  78.5 -999 2015
1 7.25 78.75 -999 2015
1 7.25    79 -999 2015
1 7.25 79.25 -999 2015
1 7.25  79.5 -999 2015
1 7.25 79.75 -999 2015
1 7.25    80 -999 2015
1 7.25 80.25 -999 2015
1 7.25  80.5 -999 2015
1 7.25 80.75 -999 2015
1 7.25    81 -999 2015
1 7.25 81.25 -999 2015
1 7.25  81.5 -999 2015
1 7.25 81.75 -999 2015
1 7.25    82 -999 2015
1 7.25 82.25 -999 2015
1 7.25  82.5 -999 2015
1 7.25 82.75 -999 2015
1 7.25    83 -999 2015
1 7.25 83.25 -999 2015
1 7.25  83.5 -999 2015
1 7.25 83.75 -999 2015
1 7.25    84 -999 2015
1 7.25 84.25 -999 2015
1 7.25  84.5 -999 2015
1 7.25 84.75 -999 2015
1 7.25    85 -999 2015
1 7.25 85.25 -999 2015
1 7.25  85.5 -999 2015
1 7.25 85.75 -999 2015
1 7.25    86 -999 2015
1 7.25 86.25 -999 2015
1 7.25  86.5 -999 2015
1 7.25 86.75 -999 2015
1 7.25    87 -999 2015
1 7.25 87.25 -999 2015
1 7.25  87.5 -999 2015
1 7.25 87.75 -999 2015
1 7.25    88 -999 2015
1 7.25 88.25 -999 2015
1 7.25  88.5 -999 2015
1 7.25 88.75 -999 2015
1 7.25    89 -999 2015
1 7.25 89.25 -999 2015
1 7.25  89.5 -999 2015
1 7.25 89.75 -999 2015
1 7.25    90 -999 2015
end
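The day and Year variables above are enough to recover the month. A minimal sketch (assuming day really is the day of the year within Year; note that leap years have 366 days, so a fixed 1-365 numbering is worth checking):

Code:
* build a Stata daily date from Jan 1 of Year plus the day-of-year offset,
* then extract the calendar month
gen edate = mdy(1, 1, Year) + day - 1
format edate %td
gen month = month(edate)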

Displaying p-values from Kruskal-Wallis tests in a loop

Hi there! I'd really appreciate some feedback on some code I wrote (below). Basically, I'd like to run Kruskal-Wallis tests on a long list of variables, and display only those with p-values less than 0.05.

Code:
sysuse auto, clear
foreach var of varlist price mpg rep78 {
    display "`var'"
    quietly kwallis `var', by(weight)
    if chi2tail(r(df), r(chi2)) < 0.05 {
        local test `test' `var'
    }
}
display `"`test'"'
After running the KW test on each variable individually, I can see that this code seems to have produced the correct result (i.e., no p-values less than 0.05). But it's hard to confirm without having a significant p-value to check against.
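One way to make that easier to verify (a sketch reusing the same r() results): print every p-value as the loop runs, so the threshold logic can be checked even when nothing falls below 0.05:

Code:
sysuse auto, clear
local test
foreach var of varlist price mpg rep78 {
    quietly kwallis `var', by(weight)
    local p = chi2tail(r(df), r(chi2))
    display "`var': p = " %6.4f `p'
    if `p' < 0.05 {
        local test `test' `var'
    }
}
display `"significant: `test'"'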

Thanks for any/all feedback!

listcoef with ivreg2

Dear Profs and colleagues,

I need standardized coefficients, so I use listcoef after regress, which works. However, when I apply listcoef after ivreg2 it says "listcoef does not work with ivreg2". I believe there might exist a command or workaround that is compatible with ivreg2. Any ideas are appreciated.
Code:
ivreg2 ln_labor_productivity_w  (hi_nationality  =IV  ) logsize foreign_aff i.year i.sector i.region ltenur lfirmage multi lageworker share_9 share_12 share_uni  ethnic1 ethnic2  ethnic3 ethnic4 ethnic5 ethnic6 ethnic7 ethnic8 , first robust
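A hedged sketch of one workaround: a fully standardized coefficient is b multiplied by sd(x)/sd(y), so after the ivreg2 call above it can be computed by hand from the estimation sample (whether this matches exactly what listcoef would report is an assumption):

Code:
* standardize the hi_nationality coefficient using e(sample) summaries
quietly summarize ln_labor_productivity_w if e(sample)
local sd_y = r(sd)
quietly summarize hi_nationality if e(sample)
local sd_x = r(sd)
display "std. coef = " _b[hi_nationality] * `sd_x' / `sd_y'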
Cheers,
Paris

Test: sandbox

Help with xxx

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(logdisclsize electionyear landslide) long ein float year str16 state
 10.84887 0 0 10248780 2011 "ME"
10.694057 0 0 10248780 2013 "ME"
10.795568 1 0 10248780 2014 "ME"
10.870756 0 0 10248780 2015 "ME"
 11.00408 0 0 10248780 2016 "ME"
10.983002 1 0 10248780 2018 "ME"
11.102217 0 0 10248780 2019 "ME"
 10.12475 0 0 10270690 2013 "ME"
11.652548 0 0 10270690 2015 "ME"
  11.8213 0 0 10270690 2016 "ME"
12.106854 0 0 10270690 2017 "ME"
12.061775 1 0 10270690 2018 "ME"
12.056853 0 0 10270690 2019 "ME"
10.518646 0 0 10317679 2012 "ME"
 10.63246 0 0 10317679 2013 "ME"
10.839287 1 0 10317679 2014 "ME"
 10.85532 0 0 10317679 2015 "ME"
10.849357 0 0 10317679 2016 "ME"
10.888838 0 0 10317679 2017 "ME"
11.097547 1 0 10317679 2018 "ME"
end
output:

Code:
. reghdfe logdisclsize electionyear closeelection size  wins_ExecComp wins_leverage w
> ins_ContribReliance, absorb (ein year) cluster(state) 
(dropped 550 singleton observations)
(MWFE estimator converged in 7 iterations)

HDFE Linear regression                            Number of obs   =     29,506
Absorbing 2 HDFE groups                           F(   6,     49) =       2.57
Statistics robust to heteroskedasticity           Prob > F        =     0.0301
                                                  R-squared       =     0.9857
                                                  Adj R-squared   =     0.9827
                                                  Within R-sq.    =     0.0010
Number of clusters (state)   =         50         Root MSE        =     0.5813

                                       (Std. err. adjusted for 50 clusters in state)
------------------------------------------------------------------------------------
                   |               Robust
      logdisclsize | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
      electionyear |   .0172858   .0112746     1.53   0.132    -.0053714     .039943
     closeelection |  -.0052555   .0202606    -0.26   0.796    -.0459708    .0354597
              size |   .0274611   .0123293     2.23   0.031     .0026844    .0522378
     wins_ExecComp |   .0447626   .0661768     0.68   0.502    -.0882246    .1777498
     wins_leverage |   .0610303   .0284481     2.15   0.037     .0038618    .1181989
wins_ContribReli~e |   .0122862     .03129     0.39   0.696    -.0505935    .0751659
             _cons |   7.541963   .1793516    42.05   0.000     7.181543    7.902384
------------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
         ein |      5085           0        5085     |
        year |         9           1           8     |
-----------------------------------------------------+

Help with regression where interaction term is omitted by Stata

Hello everyone,

I am running a regression to analyze the impact of election years and landslide elections on disclosure size, which is the log of website size. My model includes an interaction term between electionyear and landslide, but this interaction term is omitted due to collinearity. Here is a data sample:

(Note on variables - ein is a unique identifier for each organization, logdisclsize is the log of website size, org is the type of organization, electionyear is a binary indicator for whether there was an election, landslide is a binary indicator for whether there was a landslide election, size and the winsorized variables are controls)

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(logdisclsize electionyear landslide size wins_ExecComp wins_leverage wins_ContribReliance) long ein float year str16 state str4 org
 10.84887 0 0  16.34981  .06247739  .01618405  .7696673 10248780 2011 "ME" "ENV"
10.694057 0 0 16.353073  .06970697 .011605377  .6577324 10248780 2013 "ME" "ENV"
10.795568 1 0 16.415674  .08384993  .06066541  .7097442 10248780 2014 "ME" "ENV"
10.870756 0 0 16.435535  .07682364  .05358697 .59489995 10248780 2015 "ME" "ENV"
 11.00408 0 0 16.493631  .10753528  .03700265  .7087289 10248780 2016 "ME" "ENV"
10.983002 1 0   16.5836  .07915805 .037205808 .51900214 10248780 2018 "ME" "ENV"
11.102217 0 0 16.619827 .073426425 .031817265  .6083745 10248780 2019 "ME" "ENV"
 10.12475 0 0 15.689817  .09311032  .02294062  .8391001 10270690 2013 "ME" "ENV"
11.652548 0 0  15.76943  .09187462 .025827337  .8749067 10270690 2015 "ME" "ENV"
  11.8213 0 0 15.754642  .09122185 .035800748  .8071181 10270690 2016 "ME" "ENV"
12.106854 0 0  15.84974  .08144714  .03633184  .9579841 10270690 2017 "ME" "ENV"
12.061775 1 0  16.01647  .07850363 .031095315  .9270397 10270690 2018 "ME" "ENV"
12.056853 0 0 16.306654 .073099725  .02553698   .931366 10270690 2019 "ME" "ENV"
10.518646 0 0  15.52521 .031852446  .08306593  .7954195 10317679 2012 "ME" "REPR"
 10.63246 0 0 15.641062  .03481711   .1558381  .7699555 10317679 2013 "ME" "REPR"
10.839287 1 0 15.705325 .024914693    .180689  .6935283 10317679 2014 "ME" "REPR"
 10.85532 0 0 15.618464  .02366554   .1922071  .7194572 10317679 2015 "ME" "REPR"
10.849357 0 0 15.548765  .02367953  .19987574  .7202281 10317679 2016 "ME" "REPR"
10.888838 0 0 15.575302 .033151954  .18535903  .7151666 10317679 2017 "ME" "REPR"
11.097547 1 0 15.578058 .033256307   .1112484  .7204551 10317679 2018 "ME" "REPR"
end
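A quick diagnostic sketch (my own suggestion, not part of the original output): cross-tabulating the two dummies in the subsample shows which interaction cells actually occur; in the excerpt above landslide is always 0, and an interaction cell with no observations in the estimation sample gets dropped:

Code:
* check which electionyear x landslide cells exist for the REPR subsample
tab electionyear landslide if org == "REPR", missing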
Here is one of the regressions I run, and the output:

Code:
 reghdfe logdisclsize electionyear##landslide size  wins_ExecComp wins_leverage wins
> _ContribReliance if org == "REPR", absorb (ein year) cluster(state)
(dropped 18 singleton observations)
(MWFE estimator converged in 6 iterations)
note: 1.electionyear#1.landslide omitted because of collinearity

HDFE Linear regression                            Number of obs   =      1,323
Absorbing 2 HDFE groups                           F(   6,     46) =       0.99
Statistics robust to heteroskedasticity           Prob > F        =     0.4457
                                                  R-squared       =     0.9897
                                                  Adj R-squared   =     0.9874
                                                  Within R-sq.    =     0.0048
Number of clusters (state)   =         47         Root MSE        =     0.5070

                                       (Std. err. adjusted for 47 clusters in state)
------------------------------------------------------------------------------------
                   |               Robust
      logdisclsize | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------------+----------------------------------------------------------------
    1.electionyear |   .0775608   .0513554     1.51   0.138    -.0258123    .1809339
       1.landslide |   .0218505   .0929655     0.24   0.815    -.1652793    .2089803
                   |
      electionyear#|
         landslide |
              0 1  |          0  (empty)
              1 1  |          0  (omitted)
                   |
              size |   .0347288   .0279295     1.24   0.220    -.0214904     .090948
     wins_ExecComp |  -.0763669   .2829463    -0.27   0.788    -.6459082    .4931745
     wins_leverage |  -.0495822   .1025383    -0.48   0.631     -.255981    .1568167
wins_ContribReli~e |  -.0879667   .1880076    -0.47   0.642    -.4664063    .2904729
             _cons |   7.392996   .3775152    19.58   0.000     6.633097    8.152895
------------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
         ein |       234           0         234     |
        year |         9           1           8     |
-----------------------------------------------------+
I understand the interaction is omitted due to collinearity - every landslide is also an electionyear, and only 5.94% of observations are landslides (and 7% for this subset of REPR organizations). However, all electionyears are not landslides. How can I interpret these main effects given that I ideally would have wanted the coefficient on the interaction term?

Moreover, does anybody have advice on presenting this table in a paper, or on testing this differently? I am told that it is academically best practice to run the regression with the interaction rather than just the main effects, but I have never seen a table presented with the interaction omitted as it is here. Thank you so much!
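One workaround sometimes suggested for this situation is to recode the two dummies into mutually exclusive categories, which sidesteps the empty and omitted interaction cells. This is only a sketch reusing the variable names above; the three-level factor yeartype is a made-up name:

Code:
* sketch: collapse electionyear and landslide into one 3-level factor
* (landslides occur only in election years, so three cells exhaust the data)
gen byte yeartype = 0                                    // non-election year
replace yeartype = 1 if electionyear==1 & landslide==0   // election, no landslide
replace yeartype = 2 if electionyear==1 & landslide==1   // election with landslide
reghdfe logdisclsize i.yeartype size wins_ExecComp wins_leverage ///
    wins_ContribReliance if org == "REPR", absorb(ein year) cluster(state)

The coefficient on 2.yeartype then compares landslide election years with non-election years, and testing 1.yeartype against 2.yeartype recovers the landslide-versus-ordinary-election contrast that the omitted interaction would have given.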

Using GEONEAR to match rainfall data to coordinates

I have daily rainfall data from Indian Meteorological department from 2000 to 2020. The data reports rainfall, latitude, longitude and date (day, month and year). The latitude ranges from 6.5 to 38.5 and longitude ranges from 66.5 to 100.
I have also downloaded and converted shapefiles for Indian districts. In this data the X-coordinates are 69.77805 to 96.82878 and the Y-coordinates are 7.516872 to 35.53081
My objective is to match the rainfall data to Indian districts using the ‘geonear’ command.

My problem is that when I run the geonear command I get an error. I run the below command

Code:
geonear _ID _CX _CY using RF, neighbors(id LAT LON ) nearcount(3)
base latitude var _CX must be between -90 and 90
where _ID is the unique identifier, _CX holds the X-coordinates, and _CY the Y-coordinates in the shapefile; id, LAT, and LON are the unique identifier, latitude, and longitude in my rainfall data file (RF.dta).
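For what it is worth, the error message itself suggests that geonear is reading _CX (longitude, 69.8 to 96.8) in the latitude slot: geonear expects the base variables in the order id, latitude, longitude. A sketch with the coordinate order swapped, using the same variable names as above:

Code:
* geonear expects <id> <lat> <lon>; _CY holds the latitude here
geonear _ID _CY _CX using RF, neighbors(id LAT LON) nearcount(3)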

Below is an example of my rainfall data

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input int day float(month Year ddate mdate _CY _CX rainfall)
1 1 2015 20089 660  6.5  66.5 .
1 1 2015 20089 660  6.5 66.75 .
1 1 2015 20089 660  6.5    67 .
1 1 2015 20089 660  6.5 67.25 .
1 1 2015 20089 660  6.5  67.5 .
1 1 2015 20089 660  6.5 67.75 .
1 1 2015 20089 660  6.5    68 .
1 1 2015 20089 660  6.5 68.25 .
1 1 2015 20089 660  6.5  68.5 .
1 1 2015 20089 660  6.5 68.75 .
1 1 2015 20089 660  6.5    69 .
1 1 2015 20089 660  6.5 69.25 .
1 1 2015 20089 660  6.5  69.5 .
1 1 2015 20089 660  6.5 69.75 .
1 1 2015 20089 660  6.5    70 .
1 1 2015 20089 660  6.5 70.25 .
1 1 2015 20089 660  6.5  70.5 .
1 1 2015 20089 660  6.5 70.75 .
1 1 2015 20089 660  6.5    71 .
1 1 2015 20089 660  6.5 71.25 .
1 1 2015 20089 660  6.5  71.5 .
1 1 2015 20089 660  6.5 71.75 .
1 1 2015 20089 660  6.5    72 .
1 1 2015 20089 660  6.5 72.25 .
1 1 2015 20089 660  6.5  72.5 .
1 1 2015 20089 660  6.5 72.75 .
1 1 2015 20089 660  6.5    73 .
1 1 2015 20089 660  6.5 73.25 .
1 1 2015 20089 660  6.5  73.5 .
1 1 2015 20089 660  6.5 73.75 .
1 1 2015 20089 660  6.5    74 .
1 1 2015 20089 660  6.5 74.25 .
1 1 2015 20089 660  6.5  74.5 .
1 1 2015 20089 660  6.5 74.75 .
1 1 2015 20089 660  6.5    75 .
1 1 2015 20089 660  6.5 75.25 .
1 1 2015 20089 660  6.5  75.5 .
1 1 2015 20089 660  6.5 75.75 .
1 1 2015 20089 660  6.5    76 .
1 1 2015 20089 660  6.5 76.25 .
1 1 2015 20089 660  6.5  76.5 .
1 1 2015 20089 660  6.5 76.75 .
1 1 2015 20089 660  6.5    77 .
1 1 2015 20089 660  6.5 77.25 .
1 1 2015 20089 660  6.5  77.5 .
1 1 2015 20089 660  6.5 77.75 .
1 1 2015 20089 660  6.5    78 .
1 1 2015 20089 660  6.5 78.25 .
1 1 2015 20089 660  6.5  78.5 .
1 1 2015 20089 660  6.5 78.75 .
1 1 2015 20089 660  6.5    79 .
1 1 2015 20089 660  6.5 79.25 .
1 1 2015 20089 660  6.5  79.5 .
1 1 2015 20089 660  6.5 79.75 .
1 1 2015 20089 660  6.5    80 .
1 1 2015 20089 660  6.5 80.25 .
1 1 2015 20089 660  6.5  80.5 .
1 1 2015 20089 660  6.5 80.75 .
1 1 2015 20089 660  6.5    81 .
1 1 2015 20089 660  6.5 81.25 .
1 1 2015 20089 660  6.5  81.5 .
1 1 2015 20089 660  6.5 81.75 .
1 1 2015 20089 660  6.5    82 .
1 1 2015 20089 660  6.5 82.25 .
1 1 2015 20089 660  6.5  82.5 .
1 1 2015 20089 660  6.5 82.75 .
1 1 2015 20089 660  6.5    83 .
1 1 2015 20089 660  6.5 83.25 .
1 1 2015 20089 660  6.5  83.5 .
1 1 2015 20089 660  6.5 83.75 .
1 1 2015 20089 660  6.5    84 .
1 1 2015 20089 660  6.5 84.25 .
1 1 2015 20089 660  6.5  84.5 .
1 1 2015 20089 660  6.5 84.75 .
1 1 2015 20089 660  6.5    85 .
1 1 2015 20089 660  6.5 85.25 .
1 1 2015 20089 660  6.5  85.5 .
1 1 2015 20089 660  6.5 85.75 .
1 1 2015 20089 660  6.5    86 .
1 1 2015 20089 660  6.5 86.25 .
1 1 2015 20089 660  6.5  86.5 .
1 1 2015 20089 660  6.5 86.75 .
1 1 2015 20089 660  6.5    87 .
1 1 2015 20089 660  6.5 87.25 .
1 1 2015 20089 660  6.5  87.5 .
1 1 2015 20089 660  6.5 87.75 .
1 1 2015 20089 660  6.5    88 .
1 1 2015 20089 660  6.5 88.25 .
1 1 2015 20089 660  6.5  88.5 .
1 1 2015 20089 660  6.5 88.75 .
1 1 2015 20089 660  6.5    89 .
1 1 2015 20089 660  6.5 89.25 .
1 1 2015 20089 660  6.5  89.5 .
1 1 2015 20089 660  6.5 89.75 .
1 1 2015 20089 660  6.5    90 .
1 1 2015 20089 660  6.5 90.25 .
1 1 2015 20089 660  6.5  90.5 .
1 1 2015 20089 660  6.5 90.75 .
1 1 2015 20089 660  6.5    91 .
1 1 2015 20089 660  6.5 91.25 .
1 1 2015 20089 660  6.5  91.5 .
1 1 2015 20089 660  6.5 91.75 .
1 1 2015 20089 660  6.5    92 .
1 1 2015 20089 660  6.5 92.25 .
1 1 2015 20089 660  6.5  92.5 .
1 1 2015 20089 660  6.5 92.75 .
1 1 2015 20089 660  6.5    93 .
1 1 2015 20089 660  6.5 93.25 .
1 1 2015 20089 660  6.5  93.5 .
1 1 2015 20089 660  6.5 93.75 .
1 1 2015 20089 660  6.5    94 .
1 1 2015 20089 660  6.5 94.25 .
1 1 2015 20089 660  6.5  94.5 .
1 1 2015 20089 660  6.5 94.75 .
1 1 2015 20089 660  6.5    95 .
1 1 2015 20089 660  6.5 95.25 .
1 1 2015 20089 660  6.5  95.5 .
1 1 2015 20089 660  6.5 95.75 .
1 1 2015 20089 660  6.5    96 .
1 1 2015 20089 660  6.5 96.25 .
1 1 2015 20089 660  6.5  96.5 .
1 1 2015 20089 660  6.5 96.75 .
1 1 2015 20089 660  6.5    97 .
1 1 2015 20089 660  6.5 97.25 .
1 1 2015 20089 660  6.5  97.5 .
1 1 2015 20089 660  6.5 97.75 .
1 1 2015 20089 660  6.5    98 .
1 1 2015 20089 660  6.5 98.25 .
1 1 2015 20089 660  6.5  98.5 .
1 1 2015 20089 660  6.5 98.75 .
1 1 2015 20089 660  6.5    99 .
1 1 2015 20089 660  6.5 99.25 .
1 1 2015 20089 660  6.5  99.5 .
1 1 2015 20089 660  6.5 99.75 .
1 1 2015 20089 660  6.5   100 .
1 1 2015 20089 660 6.75  66.5 .
1 1 2015 20089 660 6.75 66.75 .
1 1 2015 20089 660 6.75    67 .
1 1 2015 20089 660 6.75 67.25 .
1 1 2015 20089 660 6.75  67.5 .
1 1 2015 20089 660 6.75 67.75 .
1 1 2015 20089 660 6.75    68 .
1 1 2015 20089 660 6.75 68.25 .
1 1 2015 20089 660 6.75  68.5 .
1 1 2015 20089 660 6.75 68.75 .
1 1 2015 20089 660 6.75    69 .
1 1 2015 20089 660 6.75 69.25 .
1 1 2015 20089 660 6.75  69.5 .
1 1 2015 20089 660 6.75 69.75 .
1 1 2015 20089 660 6.75    70 .
1 1 2015 20089 660 6.75 70.25 .
1 1 2015 20089 660 6.75  70.5 .
1 1 2015 20089 660 6.75 70.75 .
1 1 2015 20089 660 6.75    71 .
1 1 2015 20089 660 6.75 71.25 .
1 1 2015 20089 660 6.75  71.5 .
1 1 2015 20089 660 6.75 71.75 .
1 1 2015 20089 660 6.75    72 .
1 1 2015 20089 660 6.75 72.25 .
1 1 2015 20089 660 6.75  72.5 .
1 1 2015 20089 660 6.75 72.75 .
1 1 2015 20089 660 6.75    73 .
1 1 2015 20089 660 6.75 73.25 .
1 1 2015 20089 660 6.75  73.5 .
1 1 2015 20089 660 6.75 73.75 .
1 1 2015 20089 660 6.75    74 .
1 1 2015 20089 660 6.75 74.25 .
1 1 2015 20089 660 6.75  74.5 .
1 1 2015 20089 660 6.75 74.75 .
1 1 2015 20089 660 6.75    75 .
1 1 2015 20089 660 6.75 75.25 .
1 1 2015 20089 660 6.75  75.5 .
1 1 2015 20089 660 6.75 75.75 .
1 1 2015 20089 660 6.75    76 .
1 1 2015 20089 660 6.75 76.25 .
1 1 2015 20089 660 6.75  76.5 .
1 1 2015 20089 660 6.75 76.75 .
1 1 2015 20089 660 6.75    77 .
1 1 2015 20089 660 6.75 77.25 .
1 1 2015 20089 660 6.75  77.5 .
1 1 2015 20089 660 6.75 77.75 .
1 1 2015 20089 660 6.75    78 .
1 1 2015 20089 660 6.75 78.25 .
1 1 2015 20089 660 6.75  78.5 .
1 1 2015 20089 660 6.75 78.75 .
1 1 2015 20089 660 6.75    79 .
1 1 2015 20089 660 6.75 79.25 .
1 1 2015 20089 660 6.75  79.5 .
1 1 2015 20089 660 6.75 79.75 .
1 1 2015 20089 660 6.75    80 .
1 1 2015 20089 660 6.75 80.25 .
1 1 2015 20089 660 6.75  80.5 .
1 1 2015 20089 660 6.75 80.75 .
1 1 2015 20089 660 6.75    81 .
1 1 2015 20089 660 6.75 81.25 .
1 1 2015 20089 660 6.75  81.5 .
1 1 2015 20089 660 6.75 81.75 .
1 1 2015 20089 660 6.75    82 .
1 1 2015 20089 660 6.75 82.25 .
1 1 2015 20089 660 6.75  82.5 .
1 1 2015 20089 660 6.75 82.75 .
1 1 2015 20089 660 6.75    83 .
1 1 2015 20089 660 6.75 83.25 .
1 1 2015 20089 660 6.75  83.5 .
1 1 2015 20089 660 6.75 83.75 .
1 1 2015 20089 660 6.75    84 .
1 1 2015 20089 660 6.75 84.25 .
1 1 2015 20089 660 6.75  84.5 .
1 1 2015 20089 660 6.75 84.75 .
1 1 2015 20089 660 6.75    85 .
1 1 2015 20089 660 6.75 85.25 .
1 1 2015 20089 660 6.75  85.5 .
1 1 2015 20089 660 6.75 85.75 .
1 1 2015 20089 660 6.75    86 .
1 1 2015 20089 660 6.75 86.25 .
1 1 2015 20089 660 6.75  86.5 .
1 1 2015 20089 660 6.75 86.75 .
1 1 2015 20089 660 6.75    87 .
1 1 2015 20089 660 6.75 87.25 .
1 1 2015 20089 660 6.75  87.5 .
1 1 2015 20089 660 6.75 87.75 .
1 1 2015 20089 660 6.75    88 .
1 1 2015 20089 660 6.75 88.25 .
1 1 2015 20089 660 6.75  88.5 .
1 1 2015 20089 660 6.75 88.75 .
1 1 2015 20089 660 6.75    89 .
1 1 2015 20089 660 6.75 89.25 .
1 1 2015 20089 660 6.75  89.5 .
1 1 2015 20089 660 6.75 89.75 .
1 1 2015 20089 660 6.75    90 .
1 1 2015 20089 660 6.75 90.25 .
1 1 2015 20089 660 6.75  90.5 .
1 1 2015 20089 660 6.75 90.75 .
1 1 2015 20089 660 6.75    91 .
1 1 2015 20089 660 6.75 91.25 .
1 1 2015 20089 660 6.75  91.5 .
1 1 2015 20089 660 6.75 91.75 .
1 1 2015 20089 660 6.75    92 .
1 1 2015 20089 660 6.75 92.25 .
1 1 2015 20089 660 6.75  92.5 .
1 1 2015 20089 660 6.75 92.75 .
1 1 2015 20089 660 6.75    93 .
1 1 2015 20089 660 6.75 93.25 .
1 1 2015 20089 660 6.75  93.5 .
1 1 2015 20089 660 6.75 93.75 .
1 1 2015 20089 660 6.75    94 .
1 1 2015 20089 660 6.75 94.25 .
1 1 2015 20089 660 6.75  94.5 .
1 1 2015 20089 660 6.75 94.75 .
1 1 2015 20089 660 6.75    95 .
1 1 2015 20089 660 6.75 95.25 .
1 1 2015 20089 660 6.75  95.5 .
1 1 2015 20089 660 6.75 95.75 .
1 1 2015 20089 660 6.75    96 .
1 1 2015 20089 660 6.75 96.25 .
1 1 2015 20089 660 6.75  96.5 .
1 1 2015 20089 660 6.75 96.75 .
1 1 2015 20089 660 6.75    97 .
1 1 2015 20089 660 6.75 97.25 .
1 1 2015 20089 660 6.75  97.5 .
1 1 2015 20089 660 6.75 97.75 .
1 1 2015 20089 660 6.75    98 .
1 1 2015 20089 660 6.75 98.25 .
1 1 2015 20089 660 6.75  98.5 .
1 1 2015 20089 660 6.75 98.75 .
1 1 2015 20089 660 6.75    99 .
1 1 2015 20089 660 6.75 99.25 .
1 1 2015 20089 660 6.75  99.5 .
1 1 2015 20089 660 6.75 99.75 .
1 1 2015 20089 660 6.75   100 .
1 1 2015 20089 660    7  66.5 .
1 1 2015 20089 660    7 66.75 .
1 1 2015 20089 660    7    67 .
1 1 2015 20089 660    7 67.25 .
1 1 2015 20089 660    7  67.5 .
1 1 2015 20089 660    7 67.75 .
1 1 2015 20089 660    7    68 .
1 1 2015 20089 660    7 68.25 .
1 1 2015 20089 660    7  68.5 .
1 1 2015 20089 660    7 68.75 .
1 1 2015 20089 660    7    69 .
1 1 2015 20089 660    7 69.25 .
1 1 2015 20089 660    7  69.5 .
1 1 2015 20089 660    7 69.75 .
1 1 2015 20089 660    7    70 .
1 1 2015 20089 660    7 70.25 .
1 1 2015 20089 660    7  70.5 .
1 1 2015 20089 660    7 70.75 .
1 1 2015 20089 660    7    71 .
1 1 2015 20089 660    7 71.25 .
1 1 2015 20089 660    7  71.5 .
1 1 2015 20089 660    7 71.75 .
1 1 2015 20089 660    7    72 .
1 1 2015 20089 660    7 72.25 .
1 1 2015 20089 660    7  72.5 .
1 1 2015 20089 660    7 72.75 .
1 1 2015 20089 660    7    73 .
1 1 2015 20089 660    7 73.25 .
1 1 2015 20089 660    7  73.5 .
1 1 2015 20089 660    7 73.75 .
1 1 2015 20089 660    7    74 .
1 1 2015 20089 660    7 74.25 .
1 1 2015 20089 660    7  74.5 .
1 1 2015 20089 660    7 74.75 .
1 1 2015 20089 660    7    75 .
1 1 2015 20089 660    7 75.25 .
1 1 2015 20089 660    7  75.5 .
1 1 2015 20089 660    7 75.75 .
1 1 2015 20089 660    7    76 .
1 1 2015 20089 660    7 76.25 .
1 1 2015 20089 660    7  76.5 .
1 1 2015 20089 660    7 76.75 .
1 1 2015 20089 660    7    77 .
1 1 2015 20089 660    7 77.25 .
1 1 2015 20089 660    7  77.5 .
1 1 2015 20089 660    7 77.75 .
1 1 2015 20089 660    7    78 .
1 1 2015 20089 660    7 78.25 .
1 1 2015 20089 660    7  78.5 .
1 1 2015 20089 660    7 78.75 .
1 1 2015 20089 660    7    79 .
1 1 2015 20089 660    7 79.25 .
1 1 2015 20089 660    7  79.5 .
1 1 2015 20089 660    7 79.75 .
1 1 2015 20089 660    7    80 .
1 1 2015 20089 660    7 80.25 .
1 1 2015 20089 660    7  80.5 .
1 1 2015 20089 660    7 80.75 .
1 1 2015 20089 660    7    81 .
1 1 2015 20089 660    7 81.25 .
1 1 2015 20089 660    7  81.5 .
1 1 2015 20089 660    7 81.75 .
1 1 2015 20089 660    7    82 .
1 1 2015 20089 660    7 82.25 .
1 1 2015 20089 660    7  82.5 .
1 1 2015 20089 660    7 82.75 .
1 1 2015 20089 660    7    83 .
1 1 2015 20089 660    7 83.25 .
1 1 2015 20089 660    7  83.5 .
1 1 2015 20089 660    7 83.75 .
1 1 2015 20089 660    7    84 .
1 1 2015 20089 660    7 84.25 .
1 1 2015 20089 660    7  84.5 .
1 1 2015 20089 660    7 84.75 .
1 1 2015 20089 660    7    85 .
1 1 2015 20089 660    7 85.25 .
1 1 2015 20089 660    7  85.5 .
1 1 2015 20089 660    7 85.75 .
1 1 2015 20089 660    7    86 .
1 1 2015 20089 660    7 86.25 .
1 1 2015 20089 660    7  86.5 .
1 1 2015 20089 660    7 86.75 .
1 1 2015 20089 660    7    87 .
1 1 2015 20089 660    7 87.25 .
1 1 2015 20089 660    7  87.5 .
1 1 2015 20089 660    7 87.75 .
1 1 2015 20089 660    7    88 .
1 1 2015 20089 660    7 88.25 .
1 1 2015 20089 660    7  88.5 .
1 1 2015 20089 660    7 88.75 .
1 1 2015 20089 660    7    89 .
1 1 2015 20089 660    7 89.25 .
1 1 2015 20089 660    7  89.5 .
1 1 2015 20089 660    7 89.75 .
1 1 2015 20089 660    7    90 .
1 1 2015 20089 660    7 90.25 .
1 1 2015 20089 660    7  90.5 .
1 1 2015 20089 660    7 90.75 .
1 1 2015 20089 660    7    91 .
1 1 2015 20089 660    7 91.25 .
1 1 2015 20089 660    7  91.5 .
1 1 2015 20089 660    7 91.75 .
1 1 2015 20089 660    7    92 .
1 1 2015 20089 660    7 92.25 .
1 1 2015 20089 660    7  92.5 .
1 1 2015 20089 660    7 92.75 .
1 1 2015 20089 660    7    93 .
1 1 2015 20089 660    7 93.25 .
1 1 2015 20089 660    7  93.5 .
1 1 2015 20089 660    7 93.75 .
1 1 2015 20089 660    7    94 .
1 1 2015 20089 660    7 94.25 .
1 1 2015 20089 660    7  94.5 .
1 1 2015 20089 660    7 94.75 .
1 1 2015 20089 660    7    95 .
1 1 2015 20089 660    7 95.25 .
1 1 2015 20089 660    7  95.5 .
1 1 2015 20089 660    7 95.75 .
1 1 2015 20089 660    7    96 .
1 1 2015 20089 660    7 96.25 .
1 1 2015 20089 660    7  96.5 .
1 1 2015 20089 660    7 96.75 .
1 1 2015 20089 660    7    97 .
1 1 2015 20089 660    7 97.25 .
1 1 2015 20089 660    7  97.5 .
1 1 2015 20089 660    7 97.75 .
1 1 2015 20089 660    7    98 .
1 1 2015 20089 660    7 98.25 .
1 1 2015 20089 660    7  98.5 .
1 1 2015 20089 660    7 98.75 .
1 1 2015 20089 660    7    99 .
1 1 2015 20089 660    7 99.25 .
1 1 2015 20089 660    7  99.5 .
1 1 2015 20089 660    7 99.75 .
1 1 2015 20089 660    7   100 .
1 1 2015 20089 660 7.25  66.5 .
1 1 2015 20089 660 7.25 66.75 .
1 1 2015 20089 660 7.25    67 .
1 1 2015 20089 660 7.25 67.25 .
1 1 2015 20089 660 7.25  67.5 .
1 1 2015 20089 660 7.25 67.75 .
1 1 2015 20089 660 7.25    68 .
1 1 2015 20089 660 7.25 68.25 .
1 1 2015 20089 660 7.25  68.5 .
1 1 2015 20089 660 7.25 68.75 .
1 1 2015 20089 660 7.25    69 .
1 1 2015 20089 660 7.25 69.25 .
1 1 2015 20089 660 7.25  69.5 .
1 1 2015 20089 660 7.25 69.75 .
1 1 2015 20089 660 7.25    70 .
1 1 2015 20089 660 7.25 70.25 .
1 1 2015 20089 660 7.25  70.5 .
1 1 2015 20089 660 7.25 70.75 .
1 1 2015 20089 660 7.25    71 .
1 1 2015 20089 660 7.25 71.25 .
1 1 2015 20089 660 7.25  71.5 .
1 1 2015 20089 660 7.25 71.75 .
1 1 2015 20089 660 7.25    72 .
1 1 2015 20089 660 7.25 72.25 .
1 1 2015 20089 660 7.25  72.5 .
1 1 2015 20089 660 7.25 72.75 .
1 1 2015 20089 660 7.25    73 .
1 1 2015 20089 660 7.25 73.25 .
1 1 2015 20089 660 7.25  73.5 .
1 1 2015 20089 660 7.25 73.75 .
1 1 2015 20089 660 7.25    74 .
1 1 2015 20089 660 7.25 74.25 .
1 1 2015 20089 660 7.25  74.5 .
1 1 2015 20089 660 7.25 74.75 .
1 1 2015 20089 660 7.25    75 .
1 1 2015 20089 660 7.25 75.25 .
1 1 2015 20089 660 7.25  75.5 .
1 1 2015 20089 660 7.25 75.75 .
1 1 2015 20089 660 7.25    76 .
1 1 2015 20089 660 7.25 76.25 .
1 1 2015 20089 660 7.25  76.5 .
1 1 2015 20089 660 7.25 76.75 .
1 1 2015 20089 660 7.25    77 .
1 1 2015 20089 660 7.25 77.25 .
1 1 2015 20089 660 7.25  77.5 .
1 1 2015 20089 660 7.25 77.75 .
1 1 2015 20089 660 7.25    78 .
1 1 2015 20089 660 7.25 78.25 .
1 1 2015 20089 660 7.25  78.5 .
1 1 2015 20089 660 7.25 78.75 .
1 1 2015 20089 660 7.25    79 .
1 1 2015 20089 660 7.25 79.25 .
1 1 2015 20089 660 7.25  79.5 .
1 1 2015 20089 660 7.25 79.75 .
1 1 2015 20089 660 7.25    80 .
1 1 2015 20089 660 7.25 80.25 .
1 1 2015 20089 660 7.25  80.5 .
1 1 2015 20089 660 7.25 80.75 .
1 1 2015 20089 660 7.25    81 .
1 1 2015 20089 660 7.25 81.25 .
1 1 2015 20089 660 7.25  81.5 .
1 1 2015 20089 660 7.25 81.75 .
1 1 2015 20089 660 7.25    82 .
1 1 2015 20089 660 7.25 82.25 .
1 1 2015 20089 660 7.25  82.5 .
1 1 2015 20089 660 7.25 82.75 .
1 1 2015 20089 660 7.25    83 .
1 1 2015 20089 660 7.25 83.25 .
1 1 2015 20089 660 7.25  83.5 .
1 1 2015 20089 660 7.25 83.75 .
1 1 2015 20089 660 7.25    84 .
1 1 2015 20089 660 7.25 84.25 .
1 1 2015 20089 660 7.25  84.5 .
1 1 2015 20089 660 7.25 84.75 .
1 1 2015 20089 660 7.25    85 .
1 1 2015 20089 660 7.25 85.25 .
1 1 2015 20089 660 7.25  85.5 .
1 1 2015 20089 660 7.25 85.75 .
1 1 2015 20089 660 7.25    86 .
1 1 2015 20089 660 7.25 86.25 .
1 1 2015 20089 660 7.25  86.5 .
1 1 2015 20089 660 7.25 86.75 .
1 1 2015 20089 660 7.25    87 .
1 1 2015 20089 660 7.25 87.25 .
1 1 2015 20089 660 7.25  87.5 .
1 1 2015 20089 660 7.25 87.75 .
1 1 2015 20089 660 7.25    88 .
1 1 2015 20089 660 7.25 88.25 .
1 1 2015 20089 660 7.25  88.5 .
1 1 2015 20089 660 7.25 88.75 .
1 1 2015 20089 660 7.25    89 .
1 1 2015 20089 660 7.25 89.25 .
1 1 2015 20089 660 7.25  89.5 .
1 1 2015 20089 660 7.25 89.75 .
1 1 2015 20089 660 7.25    90 .
end
format %td ddate
format %tm mdate

Staggered DID with no never takers

This is not exactly a Stata question but rather an econometrics question about a Stata implementation. Can we use staggered DID on a dataset with no never takers (that is, no individuals who remain untreated throughout the observation window)? I had always thought the answer was yes, but after running into problems in a real-world implementation I tried a simple numerical simulation in Stata, and I am somewhat confused by the results. Below is code for replication, but let me briefly state what I've done for clarity:
  • I simulate a database with I=100 individuals and T=10 periods, no missing data, all individuals are treated exactly once in a randomly chosen period
  • I try to estimate the effect of the treatment on a (purely random) outcome
  • First I use static TWFE (following the terminology in Callaway and Sant'Anna 2021, this means regressing the outcome on time effects, individual effects, and a treatment dummy); no problem here
  • When I try dynamic TWFE (that is, regressing the outcome on time effects, individual effects, treatment lags, and treatment leads, leaving the first lag out for identification), Stata has to drop one time effect due to collinearity. The problem persists whichever command I use (reg or reghdfe) and however the variables are defined or ordered in the command; one regressor must always be dropped, implying that the collinearity is a feature of the data
  • If I fabricate just one never taker, the problem goes away and dynamic TWFE works just fine
Am I doing something wrong? If not, does this imply that dynamic TWFE does not work in the absence of never takers? Is there a lesson to be learned here or is it just a case of a method that is not prepared to deal with any kind of data?

Code:
***Database generation

clear all
set seed 260586
set obs 100
gen i=_n
expand 10
bys i: gen t=_n

gen r=round(uniform()*8) // random treatment assignment
bys i: gen rr=r if _n==1
replace rr=5 if rr==0
bys i: egen t_treat=mean(rr)
drop r rr
gen treat=t==t_treat

gen treat_dif=t-t_treat // lags and leads
forvalues i=1/9 {
    local j=10-`i'
    gen treat_lag`i'=treat_dif==-`i'
    gen treat_lead`i'=treat_dif==`i'
}

gen y=rnormal(0,1) // random outcome
drop treat_lag8 treat_lag9 // no cases with these many lags

***Estimation

xtset i t
xtreg y treat i.t, fe // traditional (static) TWFE
xtreg y treat_lag7 treat_lag6 treat_lag5 treat_lag4 treat_lag3 treat_lag2 treat treat_lead* i.t, fe // dynamic TWFE

***Now let's throw in one never taker and try again

unab vars: treat_lag1-treat_lead9
foreach var in `vars' {
    replace `var'=0 in 1/10
}

xtreg y treat_lag7 treat_lag6 treat_lag5 treat_lag4 treat_lag3 treat_lag2 treat treat_lead* i.t, fe // dynamic TWFE
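If helpful, the linear dependence can be inspected directly on the simulated data above. A sketch using _rmcoll (the o. prefix in the returned list marks the dropped column); the individual dummies are included explicitly because the dependence runs through t = t_treat + event time:

Code:
* which regressor is collinear with the rest?
_rmcoll treat_lag7 treat_lag6 treat_lag5 treat_lag4 treat_lag3 treat_lag2 ///
    treat treat_lead* i.t i.i, expand
display "`r(varlist)'"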

Academic acceptability of grouping large values in regression analysis

Hello!

I am new to Stata. I am using GDP per capita (GDPpc) and CO2 variables, which have large values, in my regression. Instead of transforming them into natural logarithms, I generate a unique number for each distinct GDPpc value within each year and country ID group.

In Stata, I use these codes:
Code:
bys year(id): gen lGDPpc = group(GDPpc)
bys year(id): gen lCO2 = group(CO2)
Is this acceptable in academic work? Is this a common practice? I observe different results when I use natural logs compared to these variables.
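For reference, the conventional treatment of such variables in the literature is the natural log (or an inverse hyperbolic sine when zeros are present), rather than group ranks. A minimal sketch with the variables above:

Code:
* standard log transform; group() produces ranks, not a log scale
gen lnGDPpc = ln(GDPpc)
gen lnCO2   = ln(CO2)
* if CO2 can be zero, asinh() keeps those observations
gen ihsCO2  = asinh(CO2)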

oneway Bonferroni

Hi,

I'm trying to do a cross-group comparison, where I compare 5 treatment groups (a control group and 4 experimental groups) in terms of a characteristic of the block in which the participants of each group reside (proportion of white residents)

Given that I'm comparing 5 groups, I used a oneway anova with the Bonferroni procedure (syntax and output below).

I'm struggling to understand why the F-test suggests that there is a significant difference in the mean proportion of white residents between the 5 treatment groups, while the p-values of the pairwise comparisons do not suggest a significant difference for any pair of groups (C and T1, T1 and T2, C and T2, etc.). Given that the F-test suggests a significant difference between groups, I would have expected that to be driven by differences between at least one pair of groups. Does anyone know how to interpret this, or whether I'm misinterpreting the output?

Thank you!

Code:
oneway perc_white treatment, bonferroni

                        Analysis of variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      1.16692792      4    .29173198      2.52     0.0399
 Within groups       130.64936   1127   .115926672
------------------------------------------------------------------------
    Total           131.816287   1131   .116548442

Bartlett's equal-variances test: chi2(4) =   2.1741    Prob>chi2 = 0.704

                    Comparison of perc_white by treatment
                                (Bonferroni)
Row Mean-|
Col Mean |          C         T1     T2_AT2     T2_AT3
---------+--------------------------------------------
      T1 |   -.060074
         |      0.687
         |
  T2_AT2 |    .004283    .064357
         |      1.000      0.464
         |
  T2_AT3 |   -.039424     .02065   -.043707
         |      1.000      1.000      1.000
         |
   T3_NA |   -.078697   -.018623    -.08298   -.039273
         |      0.174      1.000      0.105      1.000
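One way to probe this further: Bonferroni is conservative, so a less conservative multiple-comparison procedure may locate the difference that the F-test detects. A sketch with the same variables, using Tukey's HSD via pwmean:

Code:
* Tukey HSD pairwise comparisons as a cross-check on Bonferroni
pwmean perc_white, over(treatment) mcompare(tukey) effects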

Calculation of Economic Significance: A doubt

Dear All,

I hope this message finds you well. I have a question regarding the calculation of economic significance that isn't directly related to Stata, but I believe the members of this group might be able to help clarify it for me.

The doubt pertains to a paper published in Management Science, which you can find here (https://pubsonline.informs.org/doi/pdf/10.1287/mnsc.2021.4055). The dependent variables in the study are innovation metrics, specifically ln(1 + Patent) and ln(1 + Citation), for industry i, country j, and year t. The main explanatory variable is "Trust" in country j, measured in year t − 1.

In the results section, particularly in Table 3, the authors computed economic significance by stating that a one-standard-deviation increase in social trust (0.153) is associated with a 53% increase in the number of patents and a 56% increase in the number of patent citations, relative to their respective sample means. They provided a footnote (No.14) which I find quite confusing.

I typically compute economic significance using the formula: (Coefficient * SD of independent variable) / Mean of the dependent variable. However, the authors seem to have derived a different value altogether.

If anyone could explain their method, it would greatly enhance my understanding of the computation of economic significance.
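For a ln(1+y) dependent variable, one interpretation that can produce numbers quite different from the linear formula goes through the semi-elasticity; this is not necessarily the authors' method, just a sketch with placeholder values (only the 0.153 SD is from the paper; the coefficient b and mean ybar are hypothetical):

Code:
scalar b    = 0.5      // hypothetical coefficient on Trust
scalar sd   = 0.153    // one SD of Trust, as reported in the paper
scalar ybar = 2        // hypothetical sample mean of Patent
display "linear formula, % of mean:  " b*sd/ybar*100
display "semi-elasticity, % change:  " (exp(b*sd) - 1)*100

The two expressions coincide only approximately and only for small effects, which may explain why your usual formula does not reproduce the reported 53% and 56% figures.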

Thank you in advance for your assistance.

Metandi - combine SROC curves into one graph

Hi all,

I'm trying to compare sensitivity and specificity for two diagnostic tests with SROC curves. I am using the metandi & metandiplot commands. My question is how can I combine the SROC curves for test 1 & test2 into a single graph instead of generating separate graphs for both? These are my commands for test 1 & 2, with failp & curep corresponding to true & false positives respectively and failn & curen corresponding to false & true negatives respectively:

Code:
metandi failp3 curep3 failn3 curen3 if test3==3 & test1==1, nolog

metandiplot failp3 curep3 failn3 curen3 if test3==3 & test1==1 [aw=1], predopts(off) graphregion(fcolor(white)) graphregion(margin(none)) graphregion(margin(none)lcolor(white)) legend(off) xtit("Specificity", size(medium)) ytit("Sensitivity", size(medium)) summopts(msymb(S)msize(1.5)mcolor(blue)) confopts(lcolor(blue)) curveopt(lcolor(blue)lwidth(0.4)) studyopts(mlcolor(blue) msym(oh)msize(1.5))

metandi failp1 curep1 failn1 curen1 if test3==3 & test1==1, nolog

metandiplot failp1 curep1 failn1 curen1 if test3==3 & test1==1 [aw=1], predopts(off) graphregion(fcolor(white)) graphregion(margin(none)) graphregion(margin(none)lcolor(white)) legend(off) xtit("Specificity", size(medium)) ytit("Sensitivity", size(medium)) summopts(msymb(S)msize(1.5)mcolor(red)) confopts(lcolor(red)) curveopt(lcolor(red)lwidth(0.4)) studyopts(mlcolor(red) msym(oh)msize(1.5))
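One partial workaround, assuming metandiplot passes name() through to the graph engine as it does the other graph options you are already using, is to name each graph and place them side by side; this is only a sketch and does not overlay the curves in a single plotregion:

Code:
metandiplot failp3 curep3 failn3 curen3 if test3==3 & test1==1 [aw=1], ///
    legend(off) name(sroc1, replace)
metandiplot failp1 curep1 failn1 curen1 if test3==3 & test1==1 [aw=1], ///
    legend(off) name(sroc2, replace)
graph combine sroc1 sroc2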

Thanks in advance!

Sharp RDD with a small sample

Hi,
I am running a sharp RDD with a small sample (600 observations in total). My running variable is year, and there are several cutoffs, one for each year in which the treatment took place.
That leaves me with very few observations for some years.
My results:
cutoff at 2015: 129 obs above the cutoff, 106 obs below the cutoff, coefficient -24.8, PV 0.014.
cutoff at 2016: 72 obs above the cutoff, 68 obs below the cutoff, coefficient -17.03, PV 0.026.
cutoff at 2017: currently not enough observations to produce coefficients.
cutoff at 2018: 24 obs above the cutoff, 30 obs below the cutoff, coefficient -18.6, PV 0.332.
cutoff at 2019: 15 obs above the cutoff, 27 obs below the cutoff, coefficient -51.8, PV 0.22.
cutoff at 2020: 20 obs above cutoff, 46 obs below the cutoff, coefficient -26.53, PV 0.115.

It's obvious that I get p-values > 0.05 in the years with a low number of observations. What can be done here?
Can I draw conclusions for 2015 and 2016?
Here are the plots for my year cutoffs (plot attachments not shown).

Replicating tabulate intersection with analytic weights

Dear Statalisters,

I am trying to estimate the intersection of mothers and low-educated individuals without using the 'tabulate' command, in a dataset with weights, in Stata v. 16.1.
The two categories are represented by dummy variables in the dataset (mother and low_educ, respectively).
If I were to use 'tabulate', I'd write something like this:

Code:
 tab mother low_educ [aw = weight]
I ran this code for reference; the intersection cell (where both dummies == 1) has 742.41172 weighted observations.

I read the section of the manual on analytic weights and got close to some solution but I think I am missing something regarding how Stata handles weights.

My attempt:
Code:
// Rescale weights to sum to N
egen total_weight = sum(weight)
gen num_obs = _N
gen weight_2 = weight / total_weight * num_obs

// Apply weights
gen mother_w2 = mother*weight_2
gen low_educ_w2 = low_educ*weight_2

// Create table with intersect
mkmat mother_w2 low_educ_w2, matrix(D)
mat define DD= D' *D
mat list DD
I get 1087.4816 instead of 742.41172 as the number of weighted observations in the intersection.

What should I write differently to get the same result?
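For comparison, note that the matrix product above multiplies weight_2 into both dummies, so each intersection observation contributes weight_2 squared, whereas tab [aw] sums the rescaled weight once per cell. A sketch that applies the weight only once, using the variables defined above (r(sum) should then match the tab [aw] cell count):

Code:
* weight enters once: sum of rescaled weights over the intersection cell
gen both_w2 = mother*low_educ*weight_2
quietly summarize both_w2
display r(sum)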

Thank you for your help in advance.

Balu

LIWC-22 Output with Zero Overdispersion

Dear all,

I have analyzed news articles with the text analysis tool LIWC-22. LIWC analyzes the frequency of words in various categories such as emotional expressions and cognitive processes and the values are given in percentage of the total word count (i.e. a value of 7.5 in the category "moral" means that 7.5% of all words in the text are associated with moral discussions). So the values for any variable can be between 0 to 100. My dataset does not have a panel structure.
My research aim is to find out how categories like AI impact categories like moral or negative emotions, etc.

My problem is that some variables, like moral, have too many zero values, so that standard transformations like log or sqrt don't work for me. See an overview below:

Code:
summarize moral

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       moral |     22,339    .2066476     .402975          0       7.09


. tabulate moral if moral <=0.1

      Moral |
 Legitimacy |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     11,854       93.61       93.61
        .02 |          8        0.06       93.67
        .03 |         28        0.22       93.90
        .04 |         49        0.39       94.28
        .05 |         67        0.53       94.81
        .06 |         90        0.71       95.52
        .07 |        143        1.13       96.65
        .08 |        185        1.46       98.11
        .09 |        239        1.89      100.00
------------+-----------------------------------
      Total |     12,663      100.00
**Questions:**
1. Which transformation methods would be best suited for my dependent variable "moral" to perform an OLS regression?
2. Are there alternative modeling techniques that you would recommend for such a distribution?
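Not an answer so much as a sketch of one commonly discussed option: since moral is a bounded proportion with a large point mass at zero, a fractional logit avoids transforming the outcome altogether. This assumes a hypothetical predictor ai_score (substitute your own LIWC category):

Code:
// Map the percentage into [0,1]; zeros need no special treatment.
gen moral01 = moral/100
// Fractional logit for a bounded outcome with a mass at zero.
fracreg logit moral01 ai_score, vce(robust)
// Average marginal effect on the proportion scale.
margins, dydx(ai_score)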

Thank you very much for your help!

Using tddens with Survey Weights

Hello,

I am currently working with the tddens command to generate a two-dimensional density plot for two variables, m_ac_npi and wage_h. My dataset involves survey data, and I would like to incorporate survey weights into the density plot. I have a couple of questions regarding the capabilities and usage of tddens:
  1. Is there a way to apply survey weights when using the tddens command? I am trying to ensure that the density plot accurately reflects the weighted distribution of my sample.
  2. I am also interested in overlaying lines on the density plot to indicate the median values of each variable. I attempted to use twoway plot commands with tddens but didn't succeed. Is there an integrated approach within tddens or a compatible method to add these lines directly?


    Thank you in advance for your help!

Problem/bug with the new absorb() option in StataNow for regress: incorrect scores -> suest invalid

I very much welcome the new absorb() option for the regress command introduced in Stata 18.5 (StataNow). However, it creates problems further down the line.

The regress postestimation help file states the following description for the scores option of the predict command:
score is equivalent to residuals in linear regression.
However, this is no longer correct when variables have been absorbed. The scores produced here are incorrect. As a consequence, subsequent commands that require those scores will produce incorrect results as well. First and foremost, this is an issue for the suest command; see the following example:
Code:
. webuse psidextract

. quietly regress lwage wks, absorb(id)

. estimates store reg

. suest reg, vce(cluster id)

Cluster adjusted results for reg                         Number of obs = 4,165

                                   (Std. err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
mean         |
         wks |   .0010085   .0041811     0.24   0.809    -.0071864    .0092033
       _cons |   6.629139   .1973423    33.59   0.000     6.242355    7.015923
-------------+----------------------------------------------------------------
lnvar        |
       _cons |  -2.696966   .1758439   -15.34   0.000    -3.041613   -2.352318
------------------------------------------------------------------------------

. regress lwage wks, absorb(id) vce(cluster id)

Linear regression, absorbing indicators         Number of obs     =      4,165
                                                F(0, 594)         =          .
                                                Prob > F          =          .
                                                R-squared         =     0.7287
                                                Adj R-squared     =     0.6835
                                                Root MSE          =     .25963

                                   (Std. err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
       lwage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         wks |   .0010085   .0014418     0.70   0.485    -.0018231      .00384
       _cons |   6.629139   .0674909    98.22   0.000     6.496589    6.761689
------------------------------------------------------------------------------
The robust standard errors obtained from suest are now very different (and wrong) compared to the correct robust standard errors from regress. Without the absorb() option, they are virtually identical (aside from different degrees-of-freedom corrections).

Ideally, this should be fixed by computing the correct scores, which are the residuals after absorbing the respective variables. Currently, they are computed as y-xb, ignoring the absorbed variables.
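A minimal sketch that illustrates the discrepancy, taking areg's residuals (which are net of the absorbed indicators) as the benchmark:

Code:
webuse psidextract, clear
quietly regress lwage wks, absorb(id)
predict double sc, score          // the scores suest consumes
quietly areg lwage wks, absorb(id)
predict double res, residuals     // residuals after absorbing id
summarize sc res                  // the two columns differ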

xtmixed random slope Model

Hello everyone,

I want to calculate a random effects model that measures the variance of slopes between countries (the sample contains 29 countries) in clusters of countries (4 clusters in total). The idea behind the model is to measure the effect of Shalom Schwartz's 10 basic values on trust in institutions for each country in one specific cluster.

The relevant variables are:
  • "trust" (coded 1-30)
  • "cluster" (coded 1-4), representing the 4 clusters the countries are sorted into
  • "cid" (the country ID, 29 countries)
The 10 values:
  • security
  • hedonism
  • benevolence
  • power
  • achievement
  • conformity
  • tradition
  • self-direction
  • stimulation
  • universalism
These variables also exist as aggregated country mean variables:
  • cnt_se_mean
  • cnt_he_mean
  • cnt_be_mean
  • cnt_po_mean
  • cnt_ach_mean
  • cnt_co_mean
  • cnt_tr_mean
  • cnt_sd_mean
  • cnt_st_mean
  • cnt_un_mean
When I want to calculate a random slope but fixed intercept model, I need to fix the intercepts but let the slopes remain random. For this, I want to use an xtmixed model.

Do I need the aggregated values for this? If not, the model for Security in Cluster 4, not taking the fixed intercepts into account, would look like this, correct?

Code:

xtmixed trust if cluster == 4 || cid: security, cov(uns)


Then, when I want to include the fixed intercepts, I need to fix them in the model. As far as I understand it, I need to include them before the "||". So it would look like this:

Code:

xtmixed trust security if cluster == 4 || cid: security, cov(uns)


However, the values seem a bit high, and I am not sure if I did it the right way. Does it make sense to include the same variable in the fixed and the random part of the xtmixed model?
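For what it's worth, one way to keep the intercept fixed while letting only the slope vary is to suppress the random intercept with noconstant in the random part; a sketch using the mixed command (xtmixed was renamed mixed in Stata 13), under the assumption that "fixed intercepts" means no country-level random intercept:

Code:
// Fixed intercept and fixed average slope; only the security slope
// varies across countries within cluster 4.
mixed trust security if cluster == 4 || cid: security, noconstant
estat recovariance   // variance of the country-specific slopes

With a single random effect in the model, cov(uns) would have no additional covariance terms left to estimate.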

Any help with my model is highly appreciated!


Thank you very much for your help!

Best regards,
Philipp Weimer

Probability weights in random effects models for logistic regression in month-year panel


Hello,

I am attempting to estimate the effects of a treatment on my outcome using random effects and xtlogit. Both the treatment and outcome are binary.

For context, before I get to the question: The original dataset is a month-year panel (01/01/2010 - 31/12/2016) at the individual level, with fixed socioeconomic covariates. There are ~3 million individuals and ~85 million observations.

I have run the following regression on the panel data:

Code:
 
xtlogit phc_urb_vio_risk_ever vio_ever_exp i.sex i.race_skinc i.age_group i.education_highest i.hh_income i.insurance i.bf i.year i.month, re vce(robust) or
However, as only ~2000 of the individuals have been treated, I would like to run the same regression on a matched sample where each treated individual is matched to 50 untreated controls.


I have started with PSM using psmatch2:

Code:
 
* (1) estimate propensity score and match

psmatch2 vio_ever_exp i.sex i.race_skinc i.age_group i.education_highest i.hh_income i.insurance i.bf, neighbor(50) caliper(0.05) common logit

* (2) evaluate match graphically 

psgraph

* (3) evaluate match with statistical tests

pstest i.sex i.race_skinc i.age_group i.education_highest i.hh_income i.insurance i.bf
While I understand the output, I would like to run the regression on all the treated individuals and each of their respective 50 controls. That is, I need to create a sample containing only the matched treated and untreated individuals, then merge their unique identifiers back into the original panel dataset and run the regression on just the matched (panel) sample. The only way I can see this working is to keep only the relevant unique identifiers, but I cannot figure out how to do this.

I previously followed this: https://www.ssc.wisc.edu/sscc/pubs/stata_psmatch.htm and tried some troubleshooting approaches, shown below. I am attempting to integrate probability weights to run my regression on only the matched sample following propensity score matching. However, I understand that xtlogit does not support pweights or aweights. In any case, this still does not solve the underlying problem: identifying the matched individuals to keep in the panel dataset, so that the regression runs on just the ~2,000 treated individuals and each of their 50 matches.

Code:
 
* running regression on matched sample TEST

xtlogit phc_urb_vio_risk_ever vio_ever_exp i.sex i.race_skinc i.age_group i.education_highest i.hh_income i.insurance i.bf i.year i.month [fweight=_weight], re vce(robust) or


// error: may not use noninteger frequency weights r(401);
* xtlogit does not accept non-integer frequency weights (fweight)
* instead use probability weights (pweight) or analytic weights (aweight)


xtlogit phc_urb_vio_risk_ever vio_ever_exp i.sex i.race_skinc i.age_group i.education_highest i.hh_income i.insurance i.bf i.year i.month [pweight=_weight], re vce(robust) or


// error: pweight not allowed for fixed- and random-effects cases r(101);
* xtlogit does not support pweight when using random-eff or fixed-eff models
* alt approach is to use melogit, which allows weights and random eff

melogit phc_urb_vio_risk_ever vio_ever_exp i.sex i.race_skinc i.age_group i.education_highest i.hh_income i.insurance i.bf i.year i.month [pweight=_weight] || id_linha_fa:, vce(robust) or
To summarise, my question is: how do I obtain just the matched sample after psmatch2 so that I can merge it back into my panel dataset?
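One commonly used approach, sketched under the assumption that matching was done on a cross-sectional file with one row per individual and that your identifier is called person_id (a placeholder): psmatch2 leaves its generated _weight variable nonmissing only for matched observations, so those rows identify the matched sample.

Code:
// _weight is nonmissing only for matched treated and control units.
keep if _weight != .
keep person_id _weight
save matched_ids, replace

// Return to the full month-year panel and keep only matched individuals.
use panel_data, clear
merge m:1 person_id using matched_ids, keep(match) nogenerate

panel_data is likewise a placeholder for your panel file. After the merge, the panel contains only the matched individuals, and _weight carries the matching weight if you need it later.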

For reference, I use Stata 18.

Thank you for taking the time to read this. I appreciate any help and happy to answer any questions.

Bw, Sophia