Channel: Statalist

How to test the difference of estimated coefficient from two models

Hello, I am trying to run two interaction models and test the difference between the two estimated coefficients.

For example, my model 1 is xtreg DV (controls) L1.c.(iv)##L1.c.(moderator). My model 2 is xtreg DV (controls) L1.c.(iv)##L3.c.(moderator), which explores whether the 3-year lagged value of the moderator has a stronger moderating effect. How should I test this?
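If it helps, one common approach is to nest both lags of the moderator in a single model and test the equality of the two interaction coefficients directly, since -suest- does not, as far as I know, support xtreg on panel data. A minimal sketch with placeholder names (dv, iv, mod, ctrl1, ctrl2 are assumptions, not your actual variables):

Code:
* nest both lags of the moderator in one model
xtreg dv ctrl1 ctrl2 c.L1.iv##c.L1.mod c.L1.iv##c.L3.mod, fe
* list the exact coefficient names Stata assigned
xtreg, coeflegend
* then test the equality of the two interaction terms, e.g.
test _b[cL.iv#cL.mod] = _b[cL.iv#cL3.mod]
lincom _b[cL.iv#cL.mod] - _b[cL.iv#cL3.mod]

The test answers whether the two moderation effects differ; lincom additionally reports the size and confidence interval of the difference.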

If I shouldn't do it this way, please kindly advise what I should do instead. Thank you very much!

Ken

Metan

Hello,

I am starting to do a meta-analysis and I use the following syntax: "metan b se, eform", where b is the logarithm of the OR and se is its standard error.

Then I get a table with the 8 articles included, but it doesn't give me the heterogeneity value and I get the following error: matrix operators that return matrices not allowed in this context
Error in metan_output.DrawTableAD

I am using Stata version 15. Several colleagues and I have tried it on our computers, and only one of them has no problems, but I would like it to work on my machine. Could a package be missing?

I appreciate any help

Pseudo R2 .z

Hi, I'm running a simple regression with industry and year fixed effects, but asdoc reports the Pseudo R2 as .z, and I'm not sure how to display the Pseudo R2.

asdoc xtreg lag_ROA hard_final_Exact_new csopresence1 FreezeXCSO Firm_Size_w ROA_w Leverage_w Market_book_four_w Non_pension_CFO_w STD_CFO_w Board_Independence_w BoardSize_w Gender_Diversity_w Fund_Status_w FUNDING_RATIO_w Platn_Size_w CSR_Committee SustainabilityScore_w i.year i.ff_12 , robust cluster (id) nest replace drop(i.year i.ff_12 ) dec(4) save(qqqq)


Please can you advise me on how to display the Pseudo R2?
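For what it's worth, xtreg is a linear panel estimator and does not compute a pseudo R2 at all, which is why asdoc shows the placeholder .z; xtreg stores ordinary within/between/overall R-squareds instead. A minimal sketch with placeholder names (y, x1, x2 are assumptions):

Code:
xtreg y x1 x2, fe
display "within  R2 = " e(r2_w)
display "between R2 = " e(r2_b)
display "overall R2 = " e(r2_o)

Reporting e(r2_w) or e(r2_o) may therefore be a more sensible choice than a pseudo R2 here.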

csdid to replicate Callaway and Sant'Anna (2021)

Dear all
I am trying to replicate the well-known paper by Callaway and Sant'Anna (2021), Difference-in-Differences with multiple time periods (https://doi.org/10.1016/j.jeconom.2020.12.001). Rather surprisingly, I haven't been able to find any do-file that allows for this in Stata, not even on the authors' personal pages. I had a go at it using Fernando Ríos-Avila's very useful materials, specifically Playing with Stata (friosavila.github.io). The code that attempts to replicate Table 3 in the paper (arguably the main table) is copied below. Note that the original data can be found at
https://github.com/pedrohcgs/CS_RR
where it is stored as an "rds" file. I am attaching the converted CSV version of that file (dropping some variables to make it uploadable to the Statalist forum), which is, in turn, used in the code below.

Code:
import delimited "min_wage_CS_reduced.csv", clear case(lower)

/*
treat is treatment qualifier: 1 if treat at any point, 0 o/w
countyreal is a decode of county_name in the original data
*/
rename firsttreat first_treat
gen post_treatm      =inlist(year, 2004, 2005, 2006, 2007)
gen w                =post_treatm*treat
egen region_year=group(region year)

sort countyreal year
xtset countyreal year, yearly

*Table 3
//Panel A
///Row 1: TWFE
xtreg lemp w i.region_year, fe vce(cluster countyreal)

preserve
csdid lemp, ivar(countyreal) time(year) gvar(first_treat)  ///
agg(event) saverif(results_unconditional) replace
estat pretrend

use results_unconditional, clear
///Row 2
csdid_stats simple

///Row 3: Group-specific effects
csdid_stats group

///Row 4: Event Study
csdid_stats event

///Row 5: Calendar time effects
csdid_stats calendar

///Row 6: Event study e=0 e=1 w/ Balanced groups
*?
restore

//Panel B
///Row 1: TWFE
local controls i.region c.white c.hs c.pov c.pop##c.pop c.medinc##c.medinc
xtreg lemp w i.region_year (`controls')##i.year, fe vce(cluster countyreal)

preserve
csdid lemp i.region white hs pov c.pop##c.pop c.medinc##c.medinc, ivar(countyreal) time(year)  gvar(first_treat) method(drimp) ///
agg(event) saverif(results_conditional) replace 
estat pretrend

use results_conditional, clear
///Row 2
csdid_stats simple

///Row 3: Group-specific effects
csdid_stats group

///Row 4: Event Study
csdid_stats event

///Row 5: Calendar time effects
csdid_stats calendar

///Row 6: Event study e=0 e=1 w/ Balanced groups
*?
restore
Panel A in Table 3 is mostly replicated: csdid without any controls allows me to replicate rows 2, 3, 4, and 5 in the paper, where the TEs are aggregated in different ways. Fine. But still, two questions remain:
i) Where does the coefficient in row 1 in the paper, TWFE, come from? The paper says, "... we first estimate the coefficient on a post-treatment dummy variable in a model with unit fixed effects and region-year fixed effects...". The command above (under "Row 1") results in 0.0177, but the one in the paper is −0.037. Any idea what the correct specification is?
ii) Does csdid allow us to obtain the last row (Row 6: Event study w/ Balanced groups) automatically? Of course, this can be done manually, but I am wondering if it has been automated.

Panel B is somewhat replicated: rows 2 and 3 are, but the rest are not. Of course, this boils down to the model that I have interpreted from the paper, using variables from Table 2. Importantly, the paper says "... We use the doubly robust estimation procedure discussed above. [...] For each generalized propensity score, we estimate a logit model that includes each county characteristic along with quadratic terms for population and median income. For the outcome regressions, we use the same specification for the covariates".
i) My understanding is that, typically, doubly robust methods allow one to specify an outcome model and a treatment model separately (see e.g. teffects aipw). But csdid does not allow such decoupling: the model is the same for both. This, in turn, does not allow following what is declared in the original paper, where two different models are defined. Why is this decoupling not allowed in this case? Is this what is driving the divergent results? I checked drdid, and it does not allow such decoupling either. Hence, how can the specification implicitly declared in the paper be achieved?
ii) What is the specification to obtain row 1 (TWFE) in this case with controls? I get 0.0165, but the paper reports −0.008.

Any insight into this will be greatly appreciated, and hopefully, it will also help those who are trying to replicate the paper!

Many thanks in advance
JM


I am using Stata 17.0

ps: if the attachment does not work, you can open R and run this bit of code after downloading the data from https://github.com/pedrohcgs/CS_RR

ls()
rm(list = ls())
getwd()
setwd('PERSONALFOLDER')
min_wage <- readRDS('min_wage_CS.rds')
write.csv(as.matrix(min_wage),file="min_wage_CS.csv")

the file uploaded here, min_wage_CS_reduced, drops unnecessary variables from the original dataset

returnweights option for sdid

Hi all
In the paper "On Synthetic Difference-in-Differences and Related Estimation Methods in Stata", section 4.5, Other Parameters, the authors mention that we can use the returnweights option to obtain the weights; however, it seems that this option is no longer available. As a result, I am trying to get the weights using e(lambda)[1..time,1] and e(omega)[1..Nc]. To generate the weights I am using the following code:
Code:
matrix lambda = e(lambda)[1..12,1]
matrix omega = e(omega)[1..5620,1]
g weight = lambda * omega
I get this error: matrix operators that return matrices not allowed in this context.
I would really appreciate any help getting those weights as mentioned in the paper.
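For what it's worth, the error arises because -generate- cannot evaluate matrix expressions (and a 12×1 matrix cannot be multiplied by a 5620×1 matrix in any case). If the goal is simply to move the stored weights into the dataset, a hedged sketch using -svmat- might look like this:

Code:
* copy the stored sdid weight matrices into variables
matrix lambda = e(lambda)
matrix omega  = e(omega)
svmat lambda, names(lambda_)   // time weights
svmat omega,  names(omega_)    // unit weights

The new variables are padded with missing values beyond each matrix's row count; matching them back to specific units and periods depends on the row order of e(omega) and e(lambda).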

Text Matching Pattern

Dear Stata Experts,

I have lab data consisting of two string variables (test name and test result). Each variable holds multiple entries separated by commas (,), and the entries are written in no specific order, for example: test name (al khurma pcr,chikungunya pcr,dengue igg,dengue igm,dengue ns1,dengue pcr,rift valley fever pcr) and test result (not required,not required,not done,positive,not done,detected,not required).

I am interested in three test names (dengue igm, dengue ns1, dengue PCR) and their test results if they are positive or detected.

Therefore I used the below Command:

replace testresult = lower(testresult) // convert everything to lowercase to be safe
replace testname = lower(testname)


gen check = ustrregexm(testresult, "positive|detected")
gen new_test = ustrregexm(testname, "pcr|ns1|igm")
******************
gen text = ustrregexs(0) if ustrregexm(testname, "dengue igm|dengue ns1|dengue pcr")

The problem is that I can't pair each test name (dengue igm, dengue ns1, dengue pcr) with its test result, because the tests appear in no fixed position within the comma-separated text in the test name variable.
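Since the test names and test results appear to be parallel comma-separated lists (the i-th result belongs to the i-th test), one hedged sketch splits both lists and compares them position by position. The loop bound of 10 is an assumption about the maximum number of tests per record:

Code:
* split the parallel lists and pair them element-by-element
split testname,   parse(",") gen(name_)
split testresult, parse(",") gen(res_)
gen byte dengue_pos = 0
forvalues i = 1/10 {
    capture confirm variable name_`i'
    if !_rc {
        replace dengue_pos = 1 if ///
            ustrregexm(strtrim(name_`i'), "dengue (igm|ns1|pcr)") & ///
            ustrregexm(strtrim(res_`i'), "positive|detected")
    }
}

dengue_pos then flags records where any of the three dengue tests was positive or detected.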


I hope I have explained my issue clearly. Thank you for your assistance!

Meshal





Problem with creating a table showing gender-wise Worker Population Ratio and Unemployment Rates for rural and urban sectors.

Hello folks! I am working with a dataset that looks something like the following:

Code:
* Example generated by -dataex-. For more info, type help    dataex
clear
input float sex byte sector float(emp unemp labforce)
1 1 1 0 1
1 1 1 0 1
2 1 1 0 1
2 1 1 0 1
1 1 1 0 1
1 1 1 0 1
1 1 1 0 1
2 1 1 0 1
1 1 1 0 1
1 1 0 0 0
end
label values sex sex
label def sex 1 "Male", modify
label def sex 2 "Female", modify
label values sector sector
label def sector 1 "Rural", modify
So basically I have five variables (sex, sector, emp, unemp, labforce).

I want to create a table that looks something like the following: [image attachment: desired table of WPR and UR by gender and sector]


I am not able to figure out how to calculate WPR (Worker Population Ratio) and UR (Unemployment Rate) in Stata and tabulate them by the gender and sector variables.

Note:
WPR = No. of Employed (emp) / Working age population
UR = No. of Unemployed (unemp) / (No. of Employed + No. of Unemployed)
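Given those definitions, a hedged sketch (assuming every observation in the data belongs to the working-age population) collapses to counts by sex and sector and then computes the two ratios:

Code:
preserve
collapse (sum) emp unemp (count) pop = labforce, by(sex sector)
gen wpr = emp / pop               // employed / working-age population
gen ur  = unemp / (emp + unemp)   // unemployed / labour force
list sex sector wpr ur, sepby(sector)
restore

preserve/restore keeps the original observation-level data intact while the collapsed table is inspected.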

Since I am new to Stata, I would appreciate it a lot if you could add brief explanatory comments to your code. Thanks!

Equivalence of Specifications in Triple Difference Estimation

Dear all,

Suppose I run the following saturated triple difference equation:

Code:
reg Y post treated_industry treated_state treated_industry#treated_state treated_industry#post treated_state#post treated_industry#treated_state#post
the triple difference coefficient we want will be that of treated_industry#treated_state#post.

Suppose the outcome thus varies on a state, year and industry basis. We therefore run the following specification, also saturated:

Code:
reg Y treated_industry#treated_state#post i.year#i.state i.year#i.industry i.industry#i.state

Is it expected to obtain an identical coefficient on the triple difference term treated_industry#treated_state#post across both estimations?

Might this be because both models are saturated, so that the variation in the triple difference variable must be identical within the clusters defined by the above estimations?

Workshop on Generative AI and Machine Learning for Decision Making

Hi Everyone
I am running a 2 day Workshop:

Generative AI and Machine Learning for Decision Making:
Prediction, Classification and Causal Effects


Sidney Sussex College, University of Cambridge
September 26-27 2024

Please contact me for more information: mw217@econ.cam.ac.uk

Thanks
Melvyn

-ovbd- updated on SSC

Thanks to Kit Baum, a new version of the user-written command ovbd has been put up on SSC. ovbd creates a sample of random binomial data with a user-specified correlation structure and pattern of proportions. Its primary utility is in simulation exercises.

This is a new version of the command, which was originally posted on SSC in 2007 and was written for Stata Release 9, when Stata employed the older pseudorandom number generator. The new version is written for the current Release 18 of Stata.

There are several differences from the former version in its use.

First, ovbd no longer has a dependency on the user-written command ridder. Neither does the new version call drawnorm. Both of these functionalities are now internalized in the command’s code.

The clear option is no longer mandatory.

Likewise, the verbose option behaves differently from that of the first version in that details of failures to find suitable roots are always reported. ovbd now reports the involved proportions and their sought correlation coefficient for each pair where root finding fails along with the reason for the algorithm’s failure (essentially exclusively bracketing failures). The purpose of the verbose option now is to display an informational message whenever the transformed correlation matrix is not positive definite so that the user may take the indicated precaution.

Allowed syntax (values and combinations) for varlist, n() and clear follows that allowed by drawnorm.

Last, the two component functionalities—transformation of the correlation matrix and means vector, and generation of the correlated random binomial variables from them—are now contained in separate subcommands, ovbdc and ovbdr respectively, which may be called by the user directly. This serves to increase efficiency in the typical simulation use case.

The ancillary do-file that illustrates usage of the command has been updated as well. There is also a new ancillary do-file that illustrates the use of the two component commands in the context of a simulation exercise.

Propensity score matching problem

Hi, everyone

I have a propensity score matching (PSM) problem and am desperately finding the solution.

To obtain the matching scores, I have to regress a dependent variable Y1 (treatment: 1 vs. control: 0) on a series of independent and control variables including X1, X2, and X3, etc. Then I have to identify the 5 (or some other number of) nearest-neighbor matches for each treatment firm with replacement, form a new dataset, and conduct another regression analysis in which the dependent variable is a new one, such as firm performance Y2; the independent variable is Y1 (the prior dependent variable); and the control variables are now X4, X5, and X6, etc.

Does anyone know how to perform these procedures using Stata commands or the drop-down menus? Thank you for your help; I appreciate it very much and look forward to hearing from you.

Best,
Andy

Esttab export, append multiple panels .rtf

I am trying to append two panels into a single table using esttab with the append option.

However, the output below does not look ideal and still looks a lot like two separate tables.
1. The title is above the dependent variables
2. The dependent variables are listed twice
3. The table notes appear twice
4. There is in general too much space between the two panels.
[image attachment: current esttab output]


Do you know how to fix this and really get one table with two panels?


Minimum working example

Code:
sysuse auto.dta, clear

// regressions
reg price mpg, robust
estimates store ols1
estadd local X "No", replace

reg price mpg rep78 headroom trunk, robust
estimates store ols2
estadd local X "Yes", replace

ivreg2 price (mpg=weight), robust
estimates store iv1
estadd local X "No", replace

ivreg2 price (mpg=weight) rep78 headroom trunk, robust first
estimates store iv2
estadd scalar kpF = e(rkf)
estadd local X "Yes", replace


// Export top panel
esttab ols1 ols2 using "table1.rtf", replace ///
    b(%8.3f) se(%8.3f) label keep(mpg) ///
    title("Panel A: OLS") star(* 0.10 ** 0.05 *** 0.01) ///
    stats(N X, labels("N" "Controls"))
    
// Export bottom panel
esttab iv1 iv2 using "table1.rtf", ///
    b(%8.3f) se(%8.3f) label keep(mpg) ///
    title("Panel B: IV") star(* 0.10 ** 0.05 *** 0.01) ///
    stats(kpF N X, labels("F-Stat" "N" "Controls"))    append

Stata - Fixed Effects & Newey-West Standard Errors (Panel data)

Hello everyone,

I have panel data: firms and years (2017-2023). I already ran a Hausman test, and it indicated that I should use the fixed-effects model.

The model is:
[image attachment: the regression model]


I should do the regression equally as the reference paper:

Charifzadeh, Michel; Herberger, Tim A.; Högerle, Bernadette; Ferencz,
Marlene (2021): Working Capital Management und dessen Rolle
als Instrument zur Rentabilitäts- und Unternehmenswertsteuerung:
Eine empirische Untersuchung über deutsche Blue Chips. In Die
Unternehmung 75 (4)

They performed a multiple regression with Newey-West robust standard errors and also tested for fixed effects.

But as I read here in the forum, Newey-West standard errors cannot be used with the fixed-effects model. Is it valid to apply the following:

Does anyone have any idea how I can do this?

Thanks in advance!!
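In case it is useful, one route I have seen suggested is Driscoll-Kraay standard errors, a Newey-West-type kernel estimator that works with the fixed-effects model, via the user-written xtscc (ssc install xtscc). A hedged sketch with placeholder names (roa, wcc, size, leverage are assumptions, not your actual variables):

Code:
* Driscoll-Kraay (Newey-West-type) SEs with firm fixed effects
* ssc install xtscc
xtset firm year
xtscc roa wcc size leverage, fe lag(2)

The lag() choice is up to you; with only 7 years of data, a short lag length seems prudent.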

Measured confounders when using ivsvar

Hi,

I am designing an instrumental-variable SVAR analysis, and my instrument is likely associated with the outcome through both endogenous and exogenous (seasonality) measured confounders.

I am not sure whether I can control for this by including my endogenous confounder as a dependent variable and months of the year as exogenous dummies. I would control for measured confounders in a typical IV analysis, but I have seen some information suggesting that the instrument has to be exogenous in ivsvar. I have not been able to figure this out from the command documentation.

There is also some risk that lags of my target shock variable predict the instrument. Will lags of the target shock variable be included in the equation for the instrument?

Thank you for your help.

Erasing older files if exists

I am trying to execute some syntax to erase old archived input files that are two months old. I've placed a test file in my target archive folder, but it is not being erased. I've confirmed that the local for 'two months old' and my file name are in the same format and appear as 06May2024. Any advice on how to troubleshoot is appreciated.

Code:
*** 2.A Determining reference dates
********************************************

*** 2.A.1 Creating fake dataset to calculate dates of interest
// Need today's date and two months ago
// Using 9 weeks or 63 days ago (getting a target divisible by 7 for weekly automation)
insobs 1
gen today = date(c(current_date),"DMY")
gen twomonthsago = today - 63
format today twomonthsago %tdDDMonCCYY

*** 2.A.2 Moving formatted dates into macros
quietly summarize today, mean
local today: disp %tdDDMonCCYY r(mean)
quietly summarize twomonthsago, mean
local twomonthsago: disp %tdDDMonCCYY r(mean)

*** 2.B Managing Copies of the HAPI Files
********************************************
*** 2.B.1 Erasing file from two months ago, if it exists
if fileexists("`fldr_archive'\*`twomonthsago'.csv") == 1 ///
    erase "`fldr_archive'\*`twomonthsago'.csv"
else di "No archive file to erase"
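One thing worth checking: as far as I know, fileexists() tests a literal path and does not expand the * wildcard, so the condition is likely always false. A hedged sketch that expands the pattern with the dir extended macro function instead (still assuming fldr_archive holds the archive folder path):

Code:
* expand the wildcard ourselves, then erase each match
local oldfiles : dir "`fldr_archive'" files "*`twomonthsago'.csv"
if `"`oldfiles'"' == "" di "No archive file to erase"
foreach f of local oldfiles {
    erase "`fldr_archive'/`f'"
    di as text "erased `f'"
}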

did_multiplegt_dyn error: didmgt_Var_all_XX not found

Dear Statalist

I am using did_multiplegt_dyn downloaded from the SSC (for estimating heterogeneity-robust DID; de Chaisemartin & D'Haultfoeuille, 2024, Difference-in-Differences Estimators of Intertemporal Treatment Effects) on Stata version 16.

I am nominally working with a larger dataset (15 variables, 1.2×10^6 observations), but in practice I am using a smaller subsample for simplicity while I prepare my code. My data are longitudinal and unbalanced, for example:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double(var_1 var_2 var_3 ID) long YEAR byte(treated_1 cluster)
3850 285298.29921    202507 2370023009 2004 0 5
4050 294390.08914    195963 2370023009 2005 0 5
4000 505421.49687    404552 2370023009 2006 0 5
3751 378618.47563    266869 2370023009 2007 0 5
3645 425285.02748    289306 2370023009 2008 1 5
3805 471000.74992    304888 2370023009 2009 1 5
3283 570662.32445    405157 2370023009 2010 1 5
4032 590190.63912    298629 2370023009 2011 1 5
4314 724288.21162    408188 2370023009 2012 1 5
4510 764617.21335    437068 2370023009 2013 1 5
4510 839516.98757    475772 2370023009 2014 1 5
4930 880242.63312 464767.08 2370023009 2015 1 5
5442 719169.59029 406694.16 2370023009 2016 1 5
6280 849119.11759 473080.37 2370023009 2017 1 5
6567 111020.71734     27604 2370027009 2004 0 5
6661 128064.03995     42766 2370027009 2005 0 5
6754 140641.49231     66726 2370027009 2006 0 5
5563  134937.5374     55661 2370027009 2007 0 5
5814   133244.049     46556 2370027009 2008 0 5
5581 120180.65224     48602 2370027009 2009 0 5
5225 176240.74104     80450 2370027009 2010 0 5
5129 160180.52531     51499 2370027009 2011 0 5
5145  164450.5776     75849 2370027009 2012 0 5
5174 180331.39929     61031 2370027009 2013 0 5
5175  159211.5154     50907 2370027009 2014 0 5
5135 158655.13822     63240 2370027009 2015 0 5
5135 132289.24417  66011.71 2370027009 2016 0 5
5135 153429.84534  39369.13 2370027009 2017 0 5
5135 187312.69027  57855.03 2370027009 2018 0 5
end
I am running a loop with did_multiplegt_dyn as follows (note, I am also using event_plot from the SSC to graph the results):

Code:
 local vars "var_1 var_2 var_3"
    forvalues k = 1(1)2 {
        foreach v of local vars {
            did_multiplegt_dyn `v' ID YEAR treated_`k', effects(12) placebo(12) cluster(cluster) only_never_switchers ci_level(99)
event_plot e(estimates)#e(variances), default_look graph_opt(xtitle("Years relative to switch") ytitle("Average effect of switching") xlabel(-12(1)12) title(`"dCDH `v' `k'"')) stub_lag(Effect_#) stub_lead(Placebo_#) together
            graph save, replace
            matrix dcdh_b_`v'_`k' = e(estimates)
            matrix dcdh_v_`v'_`k' = e(variances)
        }
    }
As the loop runs the above did_multiplegt_dyn command I get the following output and corresponding error:

Code:
The number of placebos which can be estimated is at most 7.
The command will therefore try to estimate 7 placebo(s).

Effect_11 cannot be estimated.
There is no switcher or no control
for this effect.

Effect_12 cannot be estimated.
There is no switcher or no control
for this effect.

Placebo_7 cannot be estimated.
There is no switcher or no control
for this placebo.

Some placebos/effects could not be estimated.
Therefore, the command will not be compatible
with the honestdid command.

Some placebos could not be estimated.
Therefore, the test of joint nullity of the placebos
could not be computed.

--------------------------------------------------------------------------------
             Estimation of treatment effects: Event-study effects
--------------------------------------------------------------------------------

             |  Estimate         SE      LB CI      UB CI          N  Switchers
-------------+------------------------------------------------------------------
    Effect_1 |  -55.4194   26.01407  -122.4272    11.5884       9478         16
    Effect_2 | -19.33951   63.40962  -182.6719   143.9928       8673         16
    Effect_3 |  313.8926   73.26235   125.1813   502.6039       6848         12
    Effect_4 |  418.1752   69.77149   238.4557   597.8946       5622         11
    Effect_5 |   328.411   30.65127   249.4586   407.3634       3844          5
    Effect_6 |  645.4331   37.15429     549.73   741.1362       2933          4
    Effect_7 |   1181.55   49.06057   1055.178   1307.922       2565          4
    Effect_8 |  926.4066   53.48948   788.6268   1064.186       1906          3
    Effect_9 |  1111.936   54.67205   971.1099   1252.762       1687          3
   Effect_10 |  2894.371   52.96381   2757.945   3030.797        551          1
   Effect_11 |         .          .          .          .          0          0
   Effect_12 |         .          .          .          .          0          0
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------
               Average cumulative (total) effect per treatment unit
--------------------------------------------------------------------------------

             |  Estimate         SE      LB CI      UB CI          N     Switch  x Periods
-------------+-----------------------------------------------------------------------------
  Av_tot_eff |  353.9419   24.02224   292.0647   415.8191      11775         75            
--------------------------------------------------------------------------------
Average number of time periods over which a treatment's effect is accumulated = 4.6619718


--------------------------------------------------------------------------------
          Testing the parallel trends and no anticipation assumptions
--------------------------------------------------------------------------------

             |  Estimate         SE      LB CI      UB CI          N  Switchers
-------------+------------------------------------------------------------------
   Placebo_1 | -3.288138   42.84637  -113.6531   107.0768       8610         16
   Placebo_2 |  352.3659   165.6031  -74.19952   778.9313       7527         16
   Placebo_3 |  444.1805   215.8812  -111.8926   1000.254       4912         11
   Placebo_4 |  361.0978   394.4885  -655.0373   1377.233       2496          7
   Placebo_5 | -115.2562   40.31975   -219.113   -11.3994       1130          2
   Placebo_6 |  3679.696   42.39853   3570.484   3788.907        484          1
   Placebo_7 |         .          .          .          .          0          0
--------------------------------------------------------------------------------
didmgt_Var_all_XX not found
r(111);
I have not defined a variable called didmgt_Var_all_XX prior to running the loop; I suspect it is an output of did_multiplegt_dyn (though I cannot find it in the program's documentation). I have tried to find references to this error elsewhere, but I cannot find any mention of the problem or of how to solve it.

I would greatly appreciate some help.

Many thanks,

Guy

rolling window: updated deciles after each iteration

Hello,

I have been conducting some analysis on persistency in fund managers' skill in creating value for their investors.

I have sorted all funds based on skill ratio each time, t, as variable decile.
To analyze the differences between the performance of different deciles, I have used the rangestat command in a rolling-window mean estimation. Since I need to consider each decile at each time, I'm not sure if Stata is doing it correctly.

I'm supposed to use a measurement horizon of 3y-10y and estimate the skill for each decile. The deciles are supposed to "update" after each iteration of the rolling window estimation.

The simplest way of putting it is that Stata measures mean skill for all 10 deciles, which categorize the different funds, in the interval 1-36 first. Then it does the same but with different funds behind the decile categorization. In this way, I can check whether there is persistency in skill.

Below is my code; my problem lies in the "by(decile)" part, where I don't seem to find a way to tell Stata that each iteration is supposed to consider new funds behind the deciles (different from the previous iteration).

The deciles are correct as they are defined for each date, meaning each fund is put in 1 of 10 deciles for each date.

Code:
by fund: gen skillratio = mskill / varskill
bys date: astile decile = skillratio, nq(10) // sort all funds based on skillratio each time, t
rangestat (mean) skill, by(decile) interval(date 1 36) // m=36
rangestat (mean) skill, by(decile) interval(date 1 48) // m=48
rangestat (mean) skill, by(decile) interval(date 1 60) // m=60
rangestat (mean) skill, by(decile) interval(date 1 72) // m=72
rangestat (mean) skill, by(decile) interval(date 1 84) // m=84
rangestat (mean) skill, by(decile) interval(date 1 96) // m=96
rangestat (mean) skill, by(decile) interval(date 1 108) // m=108
rangestat (mean) skill, by(decile) interval(date 1 120) // m=120

When plotting the graph, I get 10 lines that are pretty much identical (though not fully identical), which does not coincide with the paper I am working from, since it exhibits much more variation in its graphs.
Therefore, I assume that what I am doing with the "by(decile)" part must be wrong, since I would expect more variation between the plotted graphs.

I have tried by(decile date); however, that is flawed, as it does not distinguish between the measurement horizons, m, and I only get the same graph across the different measurement horizons.

I have been looking around and not found anything on this problem so any help is greatly appreciated.

Beforehand, thank you for your insights!

Propensity score matching and ATT analysis for multiple treatments

Hello,

I want to analyse the ATT of an intervention using PSM. My treatment has three arms: a control group; treatment arm 1, receiving only treatment 1; and treatment arm 2, receiving both treatments 1 and 2. In a first step, I simply compared receiving any treatment to being in the control group using the following commands:
Code:
probit treated cov1 cov2 cov3
predict pscore_treated

foreach y of varlist dep1 dep2 dep3 {
psmatch2 treated, outcome(`y') pscore(pscore_treated) kernel common
eststo nn_treated_`y'
}
esttab nn_treated_dep1 nn_treated_dep2 nn_treated_dep3, keep(_treated)
Where treated is a binary variable indicating if any treatment was received, cov1-3 are the covariates I want to use for the matching, and dep1-3 are my different outcome variables of interest.

Now I want to look at the effects of the two treatments separately. As far as I understand, that requires three different comparisons: Control vs T1, Control vs T2, and T1 vs T2. I have already estimated the probabilities of being in the different treatments using mlogit and predicted the pscores:
Code:
mlogit treatments cov1 cov2 cov3
predict pscore_control pscore_t1 pscore_t2
However, I am struggling to find out how to estimate the different effects using psmatch2. Using teffects psmatch also does not seem to be an option with multiple treatments.
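One workaround I have seen, sketched here rather than recommended definitively, is to run each pairwise comparison on its own subsample with its own binary propensity model, since psmatch2 is inherently a two-group command. Variable names follow your example; treatments is assumed to be coded 0 (control), 1 (T1), 2 (T2):

Code:
* Control vs T1 (repeat analogously for Control vs T2 and T1 vs T2)
preserve
keep if inlist(treatments, 0, 1)
gen byte t1 = treatments == 1
probit t1 cov1 cov2 cov3
predict ps_t1
psmatch2 t1, outcome(dep1) pscore(ps_t1) kernel common
restore

Note that pairwise matching on binary pscores is not identical to matching on the generalized propensity scores from mlogit, so this only approximates the multi-treatment design.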

I am using Stata 18.

Thank you!

xtdhreg variable not found error when estimating margins

Hi Statalist

Thanks in advance for your help. I have been advised to use xtdhreg for a zero-inflated, strongly balanced panel dataset covering 3 waves of data collection, with 12,231 observations in total.

But when I run the post-estimation margins test, Stata returns "variable _one not found".

Code:
 qui xtdhreg cap_eggs_week_adj lnexp_weekly remit hh_marital, ptobit
 estimates store model2
 margins, dydx(lnexp_weekly remit) atmeans
remit and hh_marital are dummy variables; cap_eggs_week_adj and lnexp_weekly are continuous.

If I use the regular dhreg (not xtdhreg), it returns the margins normally.

Please advise me on how to produce margins.

Additionally, does anyone else have experience using xtdhreg who could advise me in interpreting the results? Would I be better off using a different model estimation?

Thank you. Apologies in advance if I've formatted this wrong, this is my first post.


Creating two new address variables based on an existing variable

Hello, I have the variable PERSONALADDRESS, a sample of which is shown below. I want to create two new variables based on this address variable: one must contain the street address, and the other must contain the city, state, and zip code. For example, the address "29 BROKEN ARROW RD BRACKETVILLE, TX 78832 UNITED STATES" must be broken into two distinct variables: one holding "29 BROKEN ARROW RD" and the other holding "BRACKETVILLE, TX 78832 UNITED STATES". The issue I was having is that the addresses have different lengths and styles. Any help would be much appreciated.
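For what it's worth, one heuristic sketch splits at a recognized street-suffix token; the suffix list (RD, DR, ST, CT, LN, AVE, BLVD) is an assumption and would need extending. Addresses without a recognized suffix, such as "151 SHADOW VALLEY SAN ANTONIO, TX 78227", are left missing and would need manual handling:

Code:
* split street from city/state/zip at the last street-suffix token
local pat "^(.* (RD|DR|ST|CT|LN|AVE|BLVD)) (.+)$"
gen street    = ustrregexs(1) if ustrregexm(PERSONALADDRESS, "`pat'")
gen citystzip = ustrregexs(3) if ustrregexm(PERSONALADDRESS, "`pat'")

Tabulating the unmatched addresses afterwards should reveal which additional suffixes the list needs.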


Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str89 PERSONALADDRESS
"29 BROKEN ARROW RD BRACKETVILLE, TX 78832 UNITED STATES"        
"189 EMPRESARIO DR UBC, TX 78253 UNITED STATES"                  
"305 E LEROY ST THREE RIVERS, TEXAS 78071 UNITED STATES"         
""                                                               
"276 SADDLE HORN DR BANDERA, TX 78003 UNITED STATES"             
"756 PURPLE SAGE DR BANDERA, TX 78003 UNITED STATES"             
"151 SHADOW VALLEY SAN ANTONIO, TX 78227 UNITED STATES"          
"8134 GRISSOM SAN ANTONIO, TX 78251"                             
""                                                               
""                                                               
"1624 W SALINAS SAN ANTONIO, TX 78207 UNITED STATES"             
"11035 INNER CYN SAN ANTONIO, TX 78252"                          
"11035 INNER CYN SAN ANTONIO, TX 78252"                          
"11035 INNER CYN SAN ANTONIO, TX 78252"                          
"7938 HANGING BRANCH UBC, TX 78253 UNITED STATES"                
"1203 TENBURY SAN ANTONIO, TX 78253"                             
""                                                               
"213 DEL VALLE SAN ANTONIO, TX 78207"                            
"213 DEL VALLE SAN ANTONIO, TX 78207"                            
"710 NOVELLA AVE ADKINS, TX 78101"                               
"13003 ESSEN FOREST SAN ANTONIO, TX 78023 UNITED STATES"         
"7922 PARKLAND GREEN DR SAN ANTONIO, TX 78240 UNITED STATES"     
"5322 EBONY SAN ANTONIO, TX 78228"                               
""                                                               
"3106 CLIMBING ROSE SAN ANTONIO, TX 78230 UNITED STATES"         
"511 W BAYLOR SAN ANTONIO, TX 78204 UNITED STATES"               
"511 W BAYLOR SAN ANTONIO, TX 78204 UNITED STATES"               
"511 W BAYLOR SAN ANTONIO, TX 78204 UNITED STATES"               
"5110 BEVERLY DR SAN ANGELO, TEXAS 76904 UNITED STATES"          
"2915 HIGHCLIFF SAN ANTONIO, TX 78218"                           
"18595 HOLLYBRANCH CT PORTER, TEXAS 77365 UNITED STATES"         
"18595 HOLLYBRANCH CT PORTER, TEXAS 77365 UNITED STATES"         
"12606 MIDDLE LN SAN ANTONIO, TX 78217 UNITED STATES"            
"2193 CR 342 LA VERNIA, TX 78121 UNITED STATES"                  
"2815 BRIAFIELD SAN ANTONIO, TX"                                 
"2815 BRIAFIELD SAN ANTONIO, TX"                                 
"2815 BRIAFIELD SAN ANTONIO, TX"                                 
"2815 BRIAFIELD SAN ANTONIO, TX"                                 
"1630 CRYSTAL BRIDGES UBC, TX 78260 UNITED STATES"               
"1630 CRYSTAL BRIDGES UBC, TX 78260 UNITED STATES"               
"1630 CRYSTAL BRIDGES UBC, TX 78260 UNITED STATES"               
"8726 DISCOVERY WAY CONVERSE, TX 78109"                          
""                                                               
""                                                               
"3843 KILLARNEY SAN ANTONIO, TX 78223"                           
"3843 KILLARNEY SAN ANTONIO, TX 78223"                           
"1314 W HARLAN AVE SAN ANTONIO, TX 78211 UNITED STATES"          
"512 COUNTY ROAD 6717 NATALIA, TX 78059 UNITED STATES"           
"4609 ROSEWOOD DR MIDLAND, TEXAS 79707 UNITED STATES"            
"1120 S MARIPOSA AVE APT 5 LOS ANGELES, CA 90006"                
""                                                               
"E COMMERCE ST / HOEFGEN AVE SAN ANTONIO, TX 78205 UNITED STATES"
""                                                               
"11503 BEAR PAW PATH SAN ANTONIO, TX 78245"                      
"500 HARROLD ST FORT WORTH, TX 76107 UNITED STATES"              
"3602 PRINCE GEORGE DR SAN ANTONIO, TX 78230"                    
"3602 PRINCE GEORGE DR SAN ANTONIO, TX 78230"                    
"1611 BERNARD KERN DR EL PASO, TEXAS 79936 UNITED STATES"        
"1611 BERNARD KERN DR EL PASO, TEXAS 79936 UNITED STATES"        
"117 HARTE IRAAN, TEXAS 79744 UNITED STATES"                     
""                                                               
""                                                               
"5906 IMPERIAL TOPAZ SAN ANTONIO, TX 78222 UNITED STATES"        
""                                                               
"306 BICKLEY SAN ANTONIO, TX 78221 UNITED STATES"                
"1109 MADRID SAN ANTONIO, TX 78237 UNITED STATES"                
"1109 MADRID SAN ANTONIO, TX 78237 UNITED STATES"                
"2801 MEADOW VIEW COMMERCE, TX 75428 UNITED STATES"              
""                                                               
"202 ESTATE SAN ANTONIO, TX 78220 UNITED STATES"                 
"202 ESTATE SAN ANTONIO, TX 78220 UNITED STATES"                 
"202 ESTATE SAN ANTONIO, TX 78220 UNITED STATES"                 
"202 ESTATE SAN ANTONIO, TX 78220 UNITED STATES"                 
"7031 HEATHERS WAY SAN ANTONIO, TX 78227 UNITED STATES"          
"2432 W THIRD ST . MADERA, CALIFORNIA 93637 UNITED STATES"       
"2023 SHADOW CLIFF SAN ANTONIO, TX 78232 UNITED STATES"          
"2023 SHADOW CLIFF SAN ANTONIO, TX 78232 UNITED STATES"          
"219 VERBENA HILL SAN ANTONIO, TX 78258 UNITED STATES"           
"5511 HWY 71 BEE CAVE, TX 78738 UNITED STATES"                   
"10607 COUGAR CHASE SAN ANTONIO, TX 78251 UNITED STATES"         
"9210 MIMOSA MANOR SAN ANTONIO, TX 78245"                        
"6611 ARANCIONE AVE SAN ANTONIO, TX 78233 UNITED STATES"         
"175 HARTFORD SAN ANTONIO, TX 78223 UNITED STATES"               
"1095 SW COUNTY ROAD CHILHOWEE, MO 44733 UNITED STATES"          
"302 E PALM DR FRESNO, TEXAS 77545 UNITED STATES"                
"7314 WESTGLADE PLACE SAN ANTONIO, TX 78227 UNITED STATES"       
""                                                               
"1160 W HIGHWAY 85 DILLEY, TX 78017"                             
"315 MARTHAS LN SOMERSET, TEXAS 78069 UNITED STATES"             
""                                                               
"10314 CLEARWATER WAY SAN ANTONIO, TX 78223 UNITED STATES"       
"10444 GREEN BRANCH SAN ANTONIO, TEXAS 78223 UNITED STATES"      
"10314 CLEARWATER WAY SAN ANTONIO, TX 78223 UNITED STATES"       
"239 CENTER ST SAN ANTONIO, TX 78202 UNITED STATES"              
"239 CENTER ST SAN ANTONIO, TX 78202 UNITED STATES"              
"20064 FM 523 ANGLETON, TX 77515 UNITED STATES"                  
""                                                               
""                                                               
""                                                               
"6706 KINGSBURY DR DALLAS, TX 75201 UNITED STATES"               
end
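One possible starting point, offered only as a sketch rather than a full solution: since there is no delimiter between the street and the city, a heuristic is to assume the street portion ends with one of a few common street-type suffixes (RD, DR, ST, AVE, etc.) and split there with regexm()/regexs(). Observations that do not match any listed suffix are left blank for manual review. The suffix list below is an assumption and would need extending for this data.

Code:
* Heuristic split: street = everything up to and including a street-type
* suffix; citystatezip = everything after it. Non-matching rows stay blank.
gen str244 street = ""
gen str244 citystatezip = ""
replace street = strtrim(regexs(1)) if ///
    regexm(PERSONALADDRESS, "^(.+ (RD|DR|ST|AVE|LN|CT|WAY|PATH|BLVD|PL|CIR)) (.+)$")
replace citystatezip = strtrim(regexs(3)) if ///
    regexm(PERSONALADDRESS, "^(.+ (RD|DR|ST|AVE|LN|CT|WAY|PATH|BLVD|PL|CIR)) (.+)$")
* Flag the rows the heuristic could not handle
list PERSONALADDRESS if street == "" & PERSONALADDRESS != ""

This will miss addresses with no suffix (e.g. "151 SHADOW VALLEY SAN ANTONIO, TX 78227") and will split too early on ones like "1120 S MARIPOSA AVE APT 5 LOS ANGELES, CA 90006", so the flagged leftovers would still need hand-checking or additional rules.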