Channel: Statalist

failing to rename a variable by its label

Dear all,

I am interested in renaming several variables using their labels (see the example below). However, it fails when I try the code suggested by Nick Cox
in the following post
HTML Code:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1367292-rename-variable-with-its-own-label
Code:
. foreach v in IngresosdeexplotaciónmilEUR S T U V W X Y Z AA AB AC AD AE AF AG AH AI AJ
> AK AL AM AN {
  2.     local lbl : var label `v'
  3.     local lbl = strtoname("`lbl'")
  4.     rename `v' `lbl'
  5. }
variable Ingresos_de_explotación
mil_EUR already defined
r(110);
As you can see, it fails. Any hint as to what might be happening?
My interest is to rename using the first two words and the year, e.g. "ingresos_explotacion_year" (even though using the whole label would not be a problem), but I cannot figure out how to do it.

Have you got any suggestions?
If using the label is not possible (because some labels exceed the maximum allowed length), is it possible to do it with a loop? I am worried about having to do it for more than 400 variables.

Thanks for the help.
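One possible cause, worth checking: two different labels can map to the same name under strtoname(), or the generated name can collide with a variable that already exists (the data already contain both IngresosdeexplotaciónmilEUR and a variable with a very similar label). A minimal sketch that disambiguates collisions by appending a counter; the varlist S-AN is taken from the example above, and the 28-character truncation is an assumption to stay under Stata's 32-character name limit:

```
* Sketch: rename by label, appending a counter when the new name is taken
local j = 0
foreach v of varlist S-AN {
    local lbl : variable label `v'
    local new = strtoname("`lbl'")
    capture confirm new variable `new'
    if _rc {                       // name already in use: disambiguate
        local ++j
        local new = usubstr("`new'", 1, 28) + "_`j'"
    }
    rename `v' `new'
}
```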

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str17(IngresosdeexplotaciónmilEUR Ingresos_de_explotación
mil_EUR T U V W X Y Z AA AB AC AD AE AF AG AH AI AJ AK AL AM AN)
data width (259 chars) exceeds max linesize. Try specifying fewer variables
r(1000);
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str17(S T U V W X Y Z AA AB AC AD AE AF AG AH AI AJ AK AL AM AN)
"n.d." "22595593" "19651680" "16161180" "18874398" "22465185" "24287783" "25537175" "20152095" "16514555" "12807244" "18952916" "18457817" "18048055" "15872465" "12471983" "10935400" "10482412" "11555475"   "12947750"   "9856050"    "8521250"   
"n.d." "22323170" "21072429" "19823515" "19077481" "18458967" "18062450" "17552041" "16476333" "15270146" "14428567" "14308686" "13008010" "11305219" "9619766"  "8160861"  "6714947"  "5383831"  "4190484.77" "3130177.01" "2532272.11" "2024728.69"
"n.d." "12546822" "12622285" "11767670" "12382735" "12409513" "12064764" "12007707" "10451069" "8734069"  "7340616"  "5978153"  "4429450"  "3742321"  "2913928"  "2251028"  "1654744"  "1376792"  "1129545"    "959650"     "586620"     "35346.44839"
"n.d." "11551899" "11732409" "11001619" "10649650" "10025200" "9783995"  "9971343"  "10855490" "11446045" "11587826" "11957626" "12638574" "12276595" "11603123" "10671343" "9822262"  "8824087"  "8127077"    "7504480"    "6819020"    "6108030"   
"n.d." "10495100" "10013400" "9023100"  "8805300"  "8007400"  "6927300"  "6499800"  "5635900"  "5073300"  "4492500"  "5183900"  "5958647"  "5879585"  "5530476"  "6175243"  "5892629"  "6001484"  "6180221"    "6622040"    "6019590"    "5149540"   
"n.d." "8161218"  "8120635"  "7555509"  "7464915"  "7447843"  "7453987"  "7610992"  "8720877"  "8965750"  "9241473"  "9963897"  "9833661"  "9291333"  "9118136"  "8573989"  "7846555"  "6935452"  "6817715"    "6877750"    "3054490"    "3001930"   
"n.d." "7621970"  "8338006"  "8591648"  "7087795"  "5403570"  "4633775"  "4621277"  "5069761"  "5029228"  "4628056"  "4095911"  "4648421"  "4497158"  "5199066"  "5564046"  "5600126"  "4907792"  "5045490"    "5582785.22" "5428407.17" "4807401.5" 
"n.d." "7217991"  "7226263"  "6554127"  "6113094"  "7818886"  "8684904"  "n.d."     "n.d."     "1310568"  "1397581"  "1001815"  "2164422"  "1356"     "1810"     "1288"     "3586"     "2393"     "2409"       "3100"       "3608.19"    "4127.939"  
"n.d." "6830814"  "5446768"  "4619628"  "5404399"  "6692432"  "6862128"  "6799981"  "6964958"  "5361728"  "4351212"  "6451078"  "5603375"  "5370269"  "4910238"  "3740331"  "3302357"  "2831875"  "3154059"    "3658820"    "2321750"    "2088129"   
"n.d." "4865000"  "4530000"  "4238000"  "4363000"  "3985000"  "4004000"  "4711000"  "4807000"  "4710008"  "n.d."     "n.d."     "n.d."     "n.d."     "n.d."     "n.d."     "n.d."     "n.d."     "n.d."       "n.d."       "n.d."       "n.d."      
end

Robustness checks - post regression graph

Hey Stata friends and family,
After I use rreg / mmregress / mregress, I want to create a scatter plot (one that accounts for the weights of observations) with a regression line.
I sometimes use this line after I calculate the weights:
twoway (lfit Y X [weight=weight])(scatter Y X [weight=weight], msymbol(oh))

Yet it seems too complicated.

In addition, which of the three robustness checks would you say is better for which cases, and why? I am looking for a fairly common, straightforward explanation.

Thank you in advance,
Yair

Help for replacing parts of a string variable

Dear community,

I want to clean a string variable that contains the name of a bank. Sometimes, it also includes some numeric content (a percentage indication) as the example from dataex shows below:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str79 Lender str20 LenderCountry_clean
"ABN AMRO BIBF"                 ""          
"ABN AMRO BIBF"                 ""          
"ABN AMRO BIBF"                 ""          
"ABN AMRO BIBF"                 ""          
"ABN AMRO BIBF"                 "Thailand"  
"ABN AMRO BIBF 10.00%"          ""          
"ABN AMRO BIBF 5.40%"           ""          
"ABN AMRO BIBF 6.00%"           ""          
"ABN AMRO BIBF 6.13%"           ""          
"ABN AMRO BIBF 6.29%"           ""          
"ABN AMRO BIBF 7.42%"           ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd"             ""          
"ABN AMRO Bank Bhd 1.50%"       ""          
"ABN AMRO Bank Bhd 1.50%"       ""          
"ABN AMRO Bank NV"              ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""            
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        ""          
"ABN AMRO Bank NV [RBS]"        "Netherlands"
"ABN AMRO Bank NV [RBS]"        "Netherlands"
"ABN AMRO Bank NV [RBS]"        "Netherlands"
"ABN AMRO Bank NV [RBS] 1.10%"  ""          
"ABN AMRO Bank NV [RBS] 20.00%" ""          
"ABN AMRO Bank NV [RBS] 32.62%" ""          
"ABN AMRO Bank NV [RBS] 5.00%"  ""          
"ABN AMRO Bank NV [RBS] 6.67%"  ""          
"ABN AMRO Inc"                  ""          
"ABN AMRO Inc"                  ""          
"ANZ Grindlays Bank Plc"        ""          
end

My objective is to complete the observations of the variable LenderCountry_clean whenever the bank name is the same. What I have done so far is to use the following command to fill in the observations:

Code:
bysort Lender (LenderCountry_clean) : replace LenderCountry_clean = LenderCountry_clean[_N] if missing(LenderCountry_clean) & _n < _N
However, the problem now is the observations that contain percentage values, e.g. 1.10% or 20.00%. In these cases the bank name is not exactly the same, so I cannot fill in the observations as I would like to. For example, for the lender "ABN AMRO Bank NV [RBS] 1.10%" I would like to have the information "Netherlands" for the variable LenderCountry_clean.

Hence, my question is whether there is a way to "clean" the variable Lender to get rid of the percentages. Note that these percentage indications, when present, are always at the end of the string. Is there a way to clean the variable Lender such that I can fill in the observations with the information from LenderCountry_clean?
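Since the percentage, when present, always sits at the end of the string, one hedged approach is to strip it with a regular expression into a helper variable (Lender_nopct is a hypothetical name) and group on that instead:

```
* Sketch: remove a trailing percentage such as " 1.10%" before grouping
gen Lender_nopct = ustrregexra(Lender, " [0-9]+(\.[0-9]+)?%$", "")
bysort Lender_nopct (LenderCountry_clean) : ///
    replace LenderCountry_clean = LenderCountry_clean[_N] if missing(LenderCountry_clean)
```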

I greatly appreciate any help or comments and thank you in advance.

Best wishes,
Elio

Parallel trends assumption test for a two way fixed effect model (omitted treatment variable)

Hi everyone,

I want to test the parallel trends assumption in a difference-in-differences design. My research is about testing the change in corporate social responsibility behaviour due to the European CSR regulation (in force for fiscal year 2017).

I'm trying to estimate this model with company fixed effects and year fixed effects.

Code:
reghdfe CSRW_TOT i.TREATMENT##i.MANDATE LNSALE LEVERAGE, absorb(YEAR COMP_ID) vce(cluster COMP_ID)
My panel data is yearly (xtset COMP_ID YEAR).
COMP_ID is the Company ID
CSRW_TOT is a variable that indicates the percentage of "CSR words" in company reports
MANDATE is a dummy variable that is equal to 1 for the years the regulation entered into force (YEAR 2017 & 2018) and 0 before (YEAR 2015 & 2016)
TREATMENT is a dummy variable that is equal to 1 when a company is affected by the European "CSR Directive" (COUNTRY UK & Germany) and equal to 0 for the control group (S&P 500 companies)

How can I check the parallel trends assumption for this fixed-effects model?
I have tried several possibilities, but since the TREATMENT variable is collinear with the company fixed effects, it is always omitted.

These are some possibilities I tried:
Code:
reghdfe CSRW_TOT i.TREATMENT##c.YEAR LNSALE LEVERAGE, absorb(YEAR COMP_ID) vce(cluster COMP_ID)
Code:
tfdiff CSRW_TOT TREATMENT LNSALE LEVERAGE, datatype(panel) model(fe) tvar(YEAR) t(2017) pvar(COMP_ID) test_pt vce(r)
Code:
xtreg CSRW_TOT i.TREATMENT#i.MANDATE LNSALE LEVERAGE, fe vce(r)
margins i.TREATMENT#i.MANDATE, noestimcheck
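For what it is worth, the omission of TREATMENT is expected: a time-invariant dummy is absorbed by the company fixed effects. A common event-study-style check, sketched here under the assumption that 2016 (the last pre-period) serves as the base year, interacts the treatment dummy with year dummies and inspects the pre-period coefficient:

```
* Sketch: year-by-treatment interactions; the 2015 coefficient should be
* near zero if pre-trends are parallel (only one pre-period is testable here)
reghdfe CSRW_TOT ib2016.YEAR#1.TREATMENT LNSALE LEVERAGE, ///
    absorb(YEAR COMP_ID) vce(cluster COMP_ID)
```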

Please let me know if you need any more information.
Thank you,
David


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str75 COMPANY float(COMP_ID YEAR) double(LNSALE LEVERAGE) float(CSRW_TOT TREATMENT MANDATE)
"3M Co"                            542 2017  17.22033468834878   .69  .08498428 0 1
"3M Co"                            542 2016 17.225799816304647  .682  .08498428 0 0
"3M Co"                            542 2015 17.275637007155716  .635  .08498428 0 0
"3M Co"                            542 2018  17.27046985113653  .727  .08498428 0 1
"4imprint Group PLC"               517 2018 13.408326974122144  .585      .0638 1 1
"4imprint Group PLC"               517 2017  13.15809917914767  .591      .0648 1 1
"4imprint Group PLC"               517 2015 12.869211547849371  .667      .0606 1 0
"4imprint Group PLC"               517 2016  13.06790719160827  .658 .064500004 1 0
"ADIDAS AG"                        390 2017  16.83562315144792  .537      .0606 1 1
"ADIDAS AG"                        390 2015 16.620635804820733  .555  .05610669 1 0
"ADIDAS AG"                        390 2018  17.07545769027168  .575      .0605 1 1
"ADIDAS AG"                        390 2016 16.760954558243554  .553  .05739328 1 0
"AES Corp"                         266 2017 16.424550408733385  .828      .0525 0 1
"AES Corp"                         266 2016 16.521625554431974  .817  .05081058 0 0
"AES Corp"                         266 2015 16.656983812489194  .815  .05831644 0 0
"AES Corp"                         266 2018  16.16973888411016    .8 .063096926 0 1
"ANSYS Inc"                        415 2017 13.803908513783359  .225      .0488 0 1
"ANSYS Inc"                        415 2018 13.906493205676146  .182      .0518 0 1
"ANSYS Inc"                        415 2015 13.749393191105485  .186      .0501 0 0
"ANSYS Inc"                        415 2016 13.756559597295334  .199      .0548 0 0
"AO World PLC"                     413 2016 13.537657356844537  .715      .0612 1 0
"AO World PLC"                     413 2018 13.843167552916222  .691      .0621 1 1
"AO World PLC"                     413 2017 13.620971470160084  .799      .0598 1 1
"AO World PLC"                     413 2015 13.397514767758244  .615      .0605 1 0
"ASOS PLC"                          52 2017 14.405132759460136  .655 .063599996 1 1
"ASOS PLC"                          52 2015 14.220279423751995  .503      .0621 1 0
"ASOS PLC"                          52 2016 14.324550989221905  .685      .0606 1 0
"ASOS PLC"                          52 2018 14.792408217380673  .564      .0597 1 1
"AURUBIS AG"                       299 2018 16.351779695579236   .43  .07022011 1 1
"AURUBIS AG"                       299 2016 16.308439691402608  .504      .0543 1 0
"AURUBIS AG"                       299 2015 16.349547155458584  .512      .0516 1 0
"AURUBIS AG"                       299 2017 16.131392356646437  .457      .0573 1 1
"AXEL SPRINGER AG"                 343 2015 15.055311596720092  .611 .063757695 1 0
"AXEL SPRINGER AG"                 343 2016 15.125129577132137  .588      .0504 1 0
"AXEL SPRINGER AG"                 343 2018 15.291126521280688  .551      .0514 1 1
"AXEL SPRINGER AG"                 343 2017 15.066932913954416  .561  .06430401 1 1
"Abiomed Inc"                      385 2016 12.621959920354175 -.009      .0532 0 0
"Abiomed Inc"                      385 2015 12.269924993804983  .005      .0586 0 0
"Abiomed Inc"                      385 2017 12.935023650630288  .123      .0527 0 1
"Abiomed Inc"                      385 2018 13.206691658108232  .036      .0569 0 1
"Accenture PLC"                     34 2016 17.254818389229275  .558  .07719437 0 0
"Accenture PLC"                     34 2015  17.23452690141206  .609  .08001322 0 0
"Accenture PLC"                     34 2018 17.395074100813453   .52  .07718616 0 1
"Accenture PLC"                     34 2017  17.32361956169649  .526  .07996055 0 1
"Adobe Inc"                        358 2015 15.237911413086874  .403  .05413578 0 0
"Adobe Inc"                        358 2016 15.383190829964633  .416   .0517677 0 0
"Adobe Inc"                        358 2017 15.582709197584865  .418      .0515 0 1
"Adobe Inc"                        358 2018 15.803591049253225  .501  .05274135 0 1
"Aggreko PLC"                      467 2016 14.629430522647521  .444      .0495 1 0
"Aggreko PLC"                      467 2015 14.700639567606773  .448      .0528 1 0
"Aggreko PLC"                      467 2018 14.686331008753667  .468      .0594 1 1
"Aggreko PLC"                      467 2017 14.452507933514392  .478       .053 1 1
"Agilent Technologies Inc"          76 2015  15.21817027317043  .442  .07440205 0 0
"Agilent Technologies Inc"          76 2016 15.211755249937333  .456  .07244961 0 0
"Agilent Technologies Inc"          76 2017 15.260309648456728  .426  .08055538 0 1
"Agilent Technologies Inc"          76 2018 15.345039087170072  .465  .06942448 0 1
"Air Products and Chemicals Inc"   602 2016 16.086594663896257  .596  .07937477 0 0
"Air Products and Chemicals Inc"   602 2017 15.833063397965265  .443      .0614 0 1
"Air Products and Chemicals Inc"   602 2015 16.162523934992667  .575   .0795564 0 0
"Air Products and Chemicals Inc"   602 2018 15.958126409418364  .414   .0749048 0 1
"Akamai Technologies Inc"           74 2017 14.665682427285581  .273      .0504 0 1
"Akamai Technologies Inc"           74 2016 14.602807245007792  .261  .05635711 0 0
"Akamai Technologies Inc"           74 2018 14.732998972330334  .412      .0494 0 1
"Akamai Technologies Inc"           74 2015 14.490429611052392  .254  .05631889 0 0
"Albemarle Corp"                     3 2018 14.937831560719182  .503 .063021526 0 1
"Albemarle Corp"                     3 2016 15.110603412111853  .513 .066984534 0 0
"Albemarle Corp"                     3 2017  14.80028315078173  .506 .065956555 0 1
"Albemarle Corp"                     3 2015 14.709779786656881  .643   .0659261 0 0
"Alexion Pharmaceuticals Inc"      411 2015 14.619184734906389  .371      .0569 0 0
"Alexion Pharmaceuticals Inc"      411 2018 15.082767972610094  .342      .0582 0 1
"Alexion Pharmaceuticals Inc"      411 2016 14.772577331302859  .344      .0603 0 0
"Alexion Pharmaceuticals Inc"      411 2017 14.941738013665358  .345      .0637 0 1
"Alliant Energy Corp"              658 2017  15.01547534089267  .691 .070985526 0 1
"Alliant Energy Corp"              658 2018 15.034036943297721   .69  .07658236 0 1
"Alliant Energy Corp"              658 2015 15.024560452030492  .686 .065864205 0 0
"Alliant Energy Corp"              658 2016 14.995272633575151  .696   .0712737 0 0
"Ameren Corp"                      279 2015 15.616064574873825    .7  .06809589 0 0
"Ameren Corp"                      279 2016 15.623471406530337  .707  .06604554 0 0
"Ameren Corp"                      279 2017   15.6198571426978  .718  .05003837 0 1
"Ameren Corp"                      279 2018 15.636343274678037  .714  .07125351 0 1
"American Airlines Group Inc"      804 2017 17.508879917351063  .923  .06164163 0 1
"American Airlines Group Inc"      804 2018  17.55809664202746 1.003  .06483074 0 1
"American Airlines Group Inc"      804 2015 17.568537831901963  .877  .05764483 0 0
"American Airlines Group Inc"      804 2016  17.52883869248052  .924  .04859415 0 0
"American Water Works Company Inc" 254 2018 15.026558275962172  .724  .08498428 0 1
"American Water Works Company Inc" 254 2015 14.917891735440003  .707  .08498428 0 0
"American Water Works Company Inc" 254 2017  15.01003890346221  .724      .0786 0 1
"American Water Works Company Inc" 254 2016 14.965766079784222  .718  .08498428 0 0
"AmerisourceBergen Corp"            44 2015  18.63564184614919  .977      .0566 0 0
"AmerisourceBergen Corp"            44 2017 18.814814595167743  .942  .06436231 0 1
"AmerisourceBergen Corp"            44 2018 18.861776107998942  .919  .06877332 0 1
"AmerisourceBergen Corp"            44 2016 18.750576021142535  .937    .060219 0 0
"Ametek Inc"                       745 2018  15.27416511476603   .51      .0478 0 1
"Ametek Inc"                       745 2016 15.195357931850126  .541      .0811 0 0
"Ametek Inc"                       745 2015 15.207280898503916  .512      .0467 0 0
"Ametek Inc"                       745 2017  15.16100558055726  .483      .0474 0 1
"Amgen Inc"                        805 2017 16.950613392966073  .684   .0598076 0 1
"Amgen Inc"                        805 2016  16.89107013137782  .615  .05831049 0 0
"Amgen Inc"                        805 2015 16.814387880662338  .608  .05422583 0 0
"Amgen Inc"                        805 2018  16.94441791067068  .812  .05772042 0 1
end
label values TREATMENT treatment
label def treatment 0 "Control", modify
label def treatment 1 "Treatment", modify
label values MANDATE mandate
label def mandate 0 "Pre", modify
label def mandate 1 "Post", modify

Set base group for 2 variables when including interaction terms in a regression

Dear all,

I am currently running a regression in Stata that involves an interaction term between years and countries (this is a simplified regression because I am teaching myself interaction terms). What I would like to do is set the base levels of two variables for this regression, namely country and year: base country = USA and base year = 2011. The code that I use to do this looks as follows:
Code:
fvset base 88 country // set USA  as base country, country dummy observation that is dropped
fvset base 2011 year   // set 2011 as base year
reg logp i.country#i.year,  baselevels
When I run the regression, however, it seems that the code only works for the case where year==2011. In every other year, the base country is no longer the USA. I tested this by running the regression without the year interaction for specific years, and confirmed that in those cases another country is chosen as the base country. Therefore, my question is the following: would anyone have a clue why my regression is not setting the USA as the base country for years other than 2011? I presume I have an error in my code, but I cannot find it. Any help would be greatly appreciated.
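One thing worth checking: with a pure interaction i.country#i.year and no main effects, Stata parameterizes the cells differently, so the fvset bases need not be honoured within every year. A sketch that forces both base levels inline with the full factorial ## form (assuming level 88 is indeed the USA code):

```
* Sketch: set both base levels directly with ib. operators
reg logp ib88.country##ib2011.year, baselevels
```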

Best,

Satya

Results of the Hsiao procedure for the homogeneity test on panel data

Hi Statalists,

I work with a panel of 5 individuals over 38 years. I performed the Hsiao test for homogeneity of the data. Unlike in most cases, I obtained p-values > 0.1 for all three tests. Could you help me choose the specification of my model and the estimation technique I should use? Should I also run other tests, such as the Hausman test, and which post-estimation tests should I run after estimating the model?
Thank you in advance for your assistance.
P-values of tests:

(The test p-values were posted as an attachment, which is not reproduced here.)

Gologit 2 interpretation

Hi everyone, I have read Williams' article about the gologit2 model. I have some concerns about the interpretation of gamma, i.e. for variables unconstrained to meet the parallel-lines assumption. If I have a dependent variable coded 1 = Strongly Disagree, 2 = Disagree, 3 = Agree, and 4 = Strongly Agree, and I have a beta coefficient of 0.07 for year, gamma 2 for year = 0.009, and gamma 3 for year = -0.012, which is the correct interpretation? Is an increase in year associated with a higher level of the dependent variable?

Extracting Residuals from loop regressions

Dear all,

I would like to obtain the residuals from my looped regressions, but I can't find a way to extract them.
I tried different approaches, such as the command predict residuals, but this gives me predictions that do not match the residuals from the regressions.
My goal is to collect the residuals from the different regressions into a single variable 'residuals'.

Is someone able to explain the difference to me and help me with how to extract them?
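In case it helps frame the question, a common pattern is to run predict on each regression's estimation sample immediately after it, then copy the values into one accumulating variable. The names y, x, and group below are placeholders, since the actual loop was not shown:

```
* Sketch: collect residuals from per-group regressions into one variable
gen double residuals = .
levelsof group, local(groups)
foreach g of local groups {
    regress y x if group == `g'
    predict double r if e(sample), residuals
    replace residuals = r if group == `g'
    drop r
}
```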

How is 'iweight' treated by the tabulate two-way command?

Hi,

I'm working with the American Community Survey (ACS), a dataset with over 3 million observations. I often have to tabulate two variables I've created, namely naics2017 and skill (both numeric floats). I have a fine-grained version of naics2017, with over 400 unique values, and a broader version, with only 20 unique values. My skill variable only ever takes four values. All values are integers, but can be negative.

When using my 20-value naics2017 variable, I can easily use my default ACS svyset and svy commands:

Code:
svyset cluster [pweight = perwt], strata(strata)
svy: tab naics2017 skill, row
It takes my computer a full two minutes to process, but that's fine. When I try to do the same thing with my 400+-value naics2017 variable, the calculation never ends and Stata eventually just freezes.

I think I've found a decent work-around that enables me to use the same code for both versions of my naics2017 variable. The following code (without the svyset command) works without putting my computer into a coma:

Code:
tab naics2017 skill [iweight = perwt], row
But the thing is: I have never used iweights before, and I don't quite understand what calculation is behind them when used with two-way tabulation in this way. The help entry for weights says that "any command that supports iweights will define exactly how they are treated." However, under the help for 'tabulate twoway' I can't find anything about iweights.

Can anybody tell me where to look to learn how tabulate twoway uses iweights? Or can somebody tell me if I am using the iweight option correctly? I am optimistic because using these two commands with my 20-value naics2017 variable gets me the same results as my standard pweight option with the svyset command. But I want to understand the iweight option for the tabulate command, and would be grateful if somebody could point me toward a useful explanation.
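If I understand -tabulate-'s handling correctly (an assumption worth verifying against your output), iweights are simply summed within cells, so the row percentages match the pweighted point estimates while skipping all survey variance computation. That reading can be checked by hand:

```
* Sketch: reproduce the iweighted row percentages by summing perwt per cell
preserve
collapse (sum) cellwt = perwt, by(naics2017 skill)
egen double rowtot = total(cellwt), by(naics2017)
gen rowpct = 100 * cellwt / rowtot
list, sepby(naics2017)
restore
```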

The ACS person weight variable btw is perwt:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int perwt
161
128
 71
148
159
 53
131
 85
312
 77
 96
 40
 48
142
  7
250
 59
 73
 89
 58
174
 74
 86
 69
133
 12
 66
 31
 36
 31
227
111
125
226
 46
159
 74
 59
 97
121
 55
198
 17
193
 22
125
 63
147
 94
 54
110
 49
105
122
 45
233
189
103
 44
 50
173
118
 44
 16
174
 79
 60
 18
102
 92
124
195
116
 53
 62
 33
 60
111
113
 39
161
 38
 67
172
133
236
 34
101
137
270
 31
 74
 65
 26
199
 30
 85
 61
163
107
end

Panel data model for policy intervention

Research question: I am writing a master thesis where I am analyzing the impact of a policy on the flow of investments from country A to country B.

Data: I have highly(!) unbalanced yearly panel data on investment flows for 9 years (the policy was implemented in year 5), i.e. microdata on the investment flows of every company residing in country A and investing in country B.

Research design: My dependent variable is individual investment flows in US dollars. My main independent variable is Policy, coded as a dichotomous dummy (0 = before policy implementation, 1 = after policy implementation). In addition, I have several macroeconomic control variables, such as GDP growth in the receiving country B, market size in country B, etc.

Statistical model: This is where my questions begin. In the last weeks, I read several articles and statistics books on panel data analysis and decided that the most suitable model for my design and the available data is a one-way fixed time-effects model (with LSDV as the estimator for the time dimension). Unfortunately, it is not possible to do two-way fixed effects, as each year there are firms that exit and enter the investment market (that is what I meant by "highly unbalanced").

My questions:
  1. Is that the right approach? Can this data be used with the model described?
  2. I also want to analyze how this policy shift affected investments in various economic sectors (e.g. finance, agriculture, energy, etc.). In the dataset, I have information on which investment goes to which sector. Can I create a categorical variable with sectors to see how sector affiliation affects the flow? Can the sector category be interpreted as individual effects, allowing for the complete FE model?
  3. I also want to see how different components of the policy influenced the investments. For example, there is a tax deduction for agriculture, a low interest rate for energy, and both for finance.
  4. Maybe I want too much and have to consider more than one model?
I would be glad to receive your advice and any other hints.

Generating a binary variable conditional on multiple values from another variable, to create Southern and Northern states.

I have a set of survey data, where I have the "Statefip code" for the US state an individual lives in. I want to create two binary variables:

1) NorthernStates: which has a value of 1 if they live in a "northern" state, and 0 otherwise
2) SouthernStates: which has a value of 1 if they live in a "southern" state, and 0 otherwise

ID    Statefip   Northern   Southern
234       1          0          1
546      12          0          1
876      18          1          0
432      33          1          0
156       6          0          0


The definition I'm using for Southern state is when Statefip equals 1, 5, 12, 13, 22, 28, 37, 45, 47, 48, 51. These numbers correspond to Alabama, Arkansas, Florida, Georgia, Louisiana, Mississippi, North Carolina, South Carolina, Tennessee, Texas, Virginia.


The definition I'm using for Northern state is when Statefip equals 9, 10, 17, 18, 23, 24, 25, 26, 33, 34, 36, 39, 42, 44, 50, 55. These numbers correspond to Connecticut, Delaware, Illinois, Indiana, Maine, Maryland, Massachusetts, Michigan, New Hampshire, New Jersey, New York, Ohio, Pennsylvania, Rhode Island, Vermont, Wisconsin.

So a west-coast or midwestern state (for example, California, where Statefip==6) would equal 0 for both the Southern and Northern variables.


Does anyone have a more elegant way to create these variables than manually creating dummies for each state and then adding them up to create the Southern and Northern variables?
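For the record, one compact way to do this, using the FIPS lists given above (statefip is written in lowercase here; adjust to the actual variable name), is inlist(), which accepts many numeric arguments:

```
* Sketch: one line per region using inlist()
generate byte SouthernStates = inlist(statefip, 1, 5, 12, 13, 22, 28, 37, 45, 47, 48, 51)
generate byte NorthernStates = inlist(statefip, 9, 10, 17, 18, 23, 24, 25, 26, 33, 34, 36, 39, 42, 44, 50, 55)
```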

Different 'count' results, copying and pasting code

Hello,

I am running Stata 14.2 on Windows 10. While using the count command to count the number of observations with certain values on two variables, I noticed the result in the output was incorrectly given as 1. So I retyped the code on the next line, and the result was correctly given as 22. I am running both lines of code sequentially, with no other commands in between them, and I continue to get these two different results. Of note, I think I may have copied and pasted the first line of code (which produces the incorrect result) from elsewhere in my .do file. Additionally, as I pasted this code into the forum window, I noticed that in the first line of code the value for spq_1d is displayed as the letter 'l', but in the second line it is the digit '1'. In my .do file, both display as '1'. Does anyone know why this might be happening? I have copied and pasted within .do files in the past and never noticed this issue. Is it bad practice to copy and paste within a .do file?
Code:
            count if spq_1d==l & spq_1b!=1
            count if spq_1d==1 & spq_1b!=1
Example of my data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long researchid byte(study_visit spq_1d spq_1b)
2000 3 0 1
2000 4 2 2
2000 5 1 1
2001 2 . .
2001 2 2 2
2001 3 0 2
end
label values study_visit study_visit
label def study_visit 2 "Baseline", modify
label def study_visit 3 "Year 1", modify
label def study_visit 4 "Year 2", modify
label def study_visit 5 "Year 3", modify

I would very much appreciate any assistance.

Thank you,
Joseph

Observations drop in xtlogit procedure

Dear Madam/Sir,

I ran the xtlogit procedure (a fixed-effects logit, i.e. conditional logit) and saw the following message indicating that many observations were dropped. It would be greatly appreciated if you could give me an idea of what it means.

. xtset gvkey
panel variable: gvkey (unbalanced)

. xtlogit futureres gafscore mascore ln_at financing fo sq_segs roa loss lev mb return big4 indspe auditdelay icw stenure i.fyear,
> fe
note: multiple positive outcomes within groups encountered.
note: 1,045 groups (8,388 obs) dropped because of all positive or
all negative outcomes.

Thank you

Sincerely,
HJ

characteristic contents too long

I have two sets of categorical variables, each of which takes on a small number of integer values. I'm trying to form all interactions between members of the two sets using the -xi- command at line 4 of the loop below. Ultimately the interactions will be fed into a lasso regression, but at the moment, just generating them results in a "characteristic contents too long" error.

I haven't found much on this error. This thread https://www.statalist.org/forums/for...tents-too-long discusses the error in the context of -reshape-, but it seems pretty specific to that command.

From the sound of the error, and the discussion in the above thread, I thought I had inadvertently tried to construct interactions with an id variable or something else with lots of values. The table at the bottom shows that this is not the case, since all of my categorical variables take on values between 0 and 4.

Finally, the prefix() option of -xi- allows for a 4-character prefix. The largest prefix that results from the loop is _378, so it looks like I should be ok there.
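A hedged guess at the cause: -xi- records every variable it creates in dataset characteristics, and with some 5,400 generated dummies that bookkeeping can overflow the characteristic length limit even though the prefixes and value ranges are fine. One way to sidestep -xi- entirely is to build the dummies by hand; a sketch reusing the qtest/chtest locals from the loop below (watch the 32-character variable-name limit):

```
* Sketch: hand-rolled interaction dummies, no -xi- characteristics involved
local n = 0
foreach q of local qtest {
    foreach v of local chtest {
        local ++n
        levelsof `q', local(qlev)
        levelsof `v', local(vlev)
        foreach a of local qlev {
            foreach b of local vlev {
                gen byte _`n'`q'`a'X`v'`b' = (`q' == `a') & (`v' == `b')
            }
        }
    }
}
```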


Code:
. *** loop test
. 
. qui d,s

. disp r(k)
100

. 
. local n=0

. capture drop _*

. unab qtest : $dashqs

. unab chtest : $chcat1

. foreach q of local qtest {
  2.         foreach v of local chtest {
  3.                 local n=`n'+1
  4.                 qui xi i.`q'*i.`v',prefix(_`n') noomit
  5.         }
  6. }
characteristic contents too long
    The maximum value of the contents is 67,784.

. qui d _*,full

. disp r(k)
5446
Code:
. sum $dashqs

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
   question1 |     16,203    .3972721    .7096666          0          2
   question2 |     16,203    .6209344    .7598174          0          2
   question3 |     16,203    .6661729    .7682009          0          2
   question4 |     16,203    .4974387    .7898037          0          2
   question5 |     16,203    .5583534    .7951362          0          2
-------------+---------------------------------------------------------
   question6 |     16,203    .8128742     .717489          0          2
   question7 |     16,203    .4954638    .7745655          0          2
   question8 |     16,203    .5668703     .790337          0          2
   question9 |     16,203    .5152132    .7729136          0          2
  question10 |     16,203    .5037956    .7752067          0          2
-------------+---------------------------------------------------------
  question11 |     16,203    .4039375    .7872436          0          2
  question12 |     16,203    .4059742    .7896028          0          2
  question13 |     16,203    .5898908    .7954562          0          2
  question14 |     16,203    .5867432    .7968132          0          2
  question15 |     16,203    .6465469    .7919159          0          2
-------------+---------------------------------------------------------
  question16 |     16,203    .4920694    .8082302          0          2
  question17 |     16,203    .5070049    .8101456          0          2
  question18 |     16,203    .5339752    .8102253          0          2
  question19 |     16,203    .4866383    .8143213          0          2
  question20 |     16,203    .4383756    .8070643          0          2
-------------+---------------------------------------------------------
  question21 |     16,203    .5215701    .8104981          0          2
  question22 |     16,203     .438499    .8071502          0          2
  question23 |     16,203    .5563167    .8025451          0          2
  question24 |     16,203     .760785    .7557867          0          2
  question25 |     16,203    .5861877    .8089423          0          2
-------------+---------------------------------------------------------
  question26 |     16,203    .5074986    .8086924          0          2
  question27 |     16,203     .826637    .7211523          0          2

. sum $chcat1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
atvictimhome |     16,203    .7191261    .5818269          0          2
  expartners |     16,203    .6197001    .6068465          0          2
familyperp~r |     16,203    .0002469    .0157106          0          1
gendervictim |     16,203    .1627476    .3691467          0          1
perpetrato~l |     16,203    .3179041    .4656764          0          1
-------------+---------------------------------------------------------
perpetrato~e |     16,203    1.435228    .9764152          0          4
  roleswitch |     16,203    .2380423    .4258983          0          1
 victimdrugs |     16,203    .0209838    .1433343          0          1
victiminjury |     16,203      .10424     .305581          0          1
victim_cat~e |     16,203    1.284762    .9585312          0          4
-------------+---------------------------------------------------------
victimalco~l |     16,203    .2095908    .4070292          0          1
perpetrato~s |     16,203    .0767142    .2661456          0          1
perpetrato~y |     16,203    .0271555    .1625413          0          1
perpetrato~e |     16,203    1.435228    .9764152          0          4

Inserting a new row into the data with gen and if statements

Hi Statalisters,

I have data on Indian administrative units for the year 1991; a sample created with dataex is below. On trying to merge these data with my master data, I found that I was missing a few districts within a few states. I obtained the details of the missing districts and want to incorporate them into the sample data given below. So I wrote the following code as an example; obviously, using gen when the variable shdist is already in place meant I failed to add this new information to my data:

Code:
// I want to incorporate district number 40 if state == 13 then generate a corresponding string variable = "Bilaspur" if shdist == 40 & state == 13
gen shdist = 40 if state == 13
gen district_labels = "Bilaspur" if shdist == 40 & state == 13 // MP
gen shdist = 34 if state == 13
gen district_labels = "Narsimhapur" if shdist == 34 & state == 13
gen shdist = 3 if state == 14
gen district_labels = "Raigarh" if shdist == 3 & state == 14 // Maharashtra
gen shdist = 27 if state == 14
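A minimal sketch of what may be intended instead — appending new observations rather than regenerating existing variables (the district numbers and names come from the post; the approach itself is an assumption):

Code:
* add one observation for Bilaspur (MP) and fill in its values
local new = _N + 1
set obs `new'
replace shdist = 40 in `new'
replace state = 13 in `new'
replace district_labels = "Bilaspur" in `new'

* likewise for Narsimhapur
local new = _N + 1
set obs `new'
replace shdist = 34 in `new'
replace state = 13 in `new'
replace district_labels = "Narsimhapur" in `new'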
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str27 district_labels byte shdist str17 sv024 long state
"SRIKAKULAM"          1 "andhra pradesh" 1
"VIZIANAGARAM"        2 "andhra pradesh" 1
"VISAKHAPATNAM"       3 "andhra pradesh" 1
"EAST GODAVARI"       4 "andhra pradesh" 1
"WEST GODAVARI"       5 "andhra pradesh" 1
"KRISHNA"             6 "andhra pradesh" 1
"GUNTUR"              7 "andhra pradesh" 1
"PRAKASAM"            8 "andhra pradesh" 1
"NELLORE"             9 "andhra pradesh" 1
"CHITTOOR"           10 "andhra pradesh" 1
"CUDDAPAH"           11 "andhra pradesh" 1
"ANANTAPUR"          12 "andhra pradesh" 1
"KURNOOL"            13 "andhra pradesh" 1
"MAHBUBNAGAR"        14 "andhra pradesh" 1
"MEDAK"              17 "andhra pradesh" 1
"NIZAMABAD"          18 "andhra pradesh" 1
"ADILABAD"           19 "andhra pradesh" 1
"NALGONDA"           23 "andhra pradesh" 1
"KARIMNAGAR"         20 "andhra pradesh" 1
"WARANGAL"           21 "andhra pradesh" 1
"KHAMMAM"            22 "andhra pradesh" 1
"HYDERABAD"          16 "andhra pradesh" 1
"Dhubri"              1 "assam"          3
"Bongaigaon"          3 "assam"          3
"Goalpara"            4 "assam"          3
"Barpeta"             5 "assam"          3
"Nalbari"             6 "assam"          3
"Kamrup"              7 "assam"          3
"Sonitpur"            9 "assam"          3
"Lakhimpur"          10 "assam"          3
"Dhemaji"            11 "assam"          3
"Marigaon"           12 "assam"          3
"Nagaon"             13 "assam"          3
"Golaghat"           14 "assam"          3
"Jorhat"             15 "assam"          3
"Sibsagar"           16 "assam"          3
"Tinsukia"           18 "assam"          3
"Karbi Anglong"      19 "assam"          3
"Karimganj"          21 "assam"          3
"Hailakandi"         22 "assam"          3
"Cachar"             23 "assam"          3
"Dibrugarh"          17 "assam"          3
"Begusarai"          17 "bihar"          4
"Madhubani"          20 "bihar"          4
"Madhepura"          22 "bihar"          4
"Bhojpur"             3 "bihar"          4
"Rohtas"              4 "bihar"          4
"Patna"               1 "bihar"          4
"Nalanda"             2 "bihar"          4
"Jehanabad"           6 "bihar"          4
"Gaya"                7 "bihar"          4
"Nawada"              8 "bihar"          4
"Saran"               9 "bihar"          4
"Siwan"              10 "bihar"          4
"Gopalganj"          11 "bihar"          4
"Purba Champaran"    13 "bihar"          4
"Sitamarhi"          14 "bihar"          4
"Muzaffarpur"        15 "bihar"          4
"Vaishali"           16 "bihar"          4
"Samastipur"         18 "bihar"          4
"Darbhanga"          19 "bihar"          4
"Saharsa"            21 "bihar"          4
"Purnia"             23 "bihar"          4
"Katihar"            24 "bihar"          4
"Khagaria"           25 "bihar"          4
"Bhagalpur"          27 "bihar"          4
"Godda"              28 "bihar"          4
"Sahibganj"          29 "bihar"          4
"Dumka"              30 "bihar"          4
"Deoghar"            31 "bihar"          4
"Munger"             26 "bihar"          4
"Pashchim Champaran" 12 "bihar"          4
"Goa"                 1 "goa"            6
"JAMNAGAR"            1 "gujarat"        7
"RAJKOT"              2 "gujarat"        7
"SURENDRANAGAR"       3 "gujarat"        7
"BHAVNAGAR"           4 "gujarat"        7
"AMRELI"              5 "gujarat"        7
"JUNAGADH"            6 "gujarat"        7
"KACHCHH"             7 "gujarat"        7
"BANASKANTHA"         8 "gujarat"        7
"SABARKANTHA"         9 "gujarat"        7
"MAHESANA"           10 "gujarat"        7
"GANDHINAGAR"        11 "gujarat"        7
"AHMADABAD"          12 "gujarat"        7
"KHEDA"              13 "gujarat"        7
"PANCHMAHALS"        14 "gujarat"        7
"VADODARA"           15 "gujarat"        7
"BHARUCH"            16 "gujarat"        7
"SURAT"              17 "gujarat"        7
"VALSAD"             18 "gujarat"        7
"KARNAL"              5 "haryana"        8
"SONIPAT"             7 "haryana"        8
"ROHTAK"              8 "haryana"        8
"AMBALA"              1 "haryana"        8
"YAMUNANAGAR"         2 "haryana"        8
"KURUKSHETRA"         3 "haryana"        8
"FARIDABAD"           9 "haryana"        8
"GURGAON"            10 "haryana"        8
"MAHENDRAGARH"       12 "haryana"        8
end
label values state state
label def state 1 "andhra pradesh", modify
label def state 3 "assam", modify
label def state 4 "bihar", modify
label def state 6 "goa", modify
label def state 7 "gujarat", modify
label def state 8 "haryana", modify

Do let me know where I am going wrong and how to circumvent it.

Best,
Lori

store R squared and residual of foreach loop regression

Dear all,

For our master paper, we need to replicate the Fama and French three factor model. We started with the one factor model.
At the moment, we have 520 variables. We have one dependent variable: MktRF (Fama and French one factor model) and 519 independent variables (companies) with the daily excess returns over 20 years.
We were able to execute our loop regression for our 519 firms (from the variable SOLVAY to ADECCOGROUP) by executing the following command:

foreach x of varlist SOLVAY-ADECCOGROUP{
regress `x' MktRF
}

Now, we have our 519 regressions results and we want to store the 519 residual values in a new variable and store the 519 R-squared values in another variable.

For the residual values, we tried to use this command:
predict residual, r

But unfortunately, with this command, Stata gives us the daily residuals and not the residuals per firm obtained from the loop regression.

For the R-squared values, we tried to use this command:
tempname results output
foreach x of varlist SOLVAY-ADECCOGROUP {
regress `x' MktRF
mat `results' = r(r2)
mat `output' = (nullmat(`output') \ `results'[1,1])
}
mat colnames `output' = R-squared
svmat `output', names(col)

With this command, Stata gives us an error for the svmat `output', names(col) command:
invalid syntax
r(198);
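A sketch of a possible fix, assuming two issues: regress leaves the R-squared in e(r2) rather than r(r2), and matrix row/column names must be valid Stata names, so the hyphen in R-squared is rejected while R_squared is accepted:

Code:
tempname output
foreach x of varlist SOLVAY-ADECCOGROUP {
        regress `x' MktRF
        mat `output' = (nullmat(`output') \ e(r2))
}
mat colnames `output' = R_squared
svmat `output', names(col)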

Does anyone know how we can solve this?

Why is xtpedroni Limited To 6 Variables?

Hi everyone,

This is my first post in this forum after several years of lurking.

My question is: why can I not run PDOLS regressions with more than 6 independent variables using xtpedroni? Instead I get the message "not positive definite" and some diff/demeaned variables are created. Is there a theoretical reason for this?

I checked the online Stata literature and forum posts but could not find an explicit explanation. I tried this with other datasets and got the same result, so I am fairly sure it is not just due to my dataset.


I am using Stata/SE 13.1.


Thank you in advance.

Installing plssem in Stata version 14.1

Dear Stata users,

I am trying to install the plssem package using . ssc install plssem. Unfortunately the package is not compatible with my Stata version (14.1). I get the following error message:

this is version 14.1 of Stata; it cannot run version 15.1 programs
You can purchase the latest version of Stata by visiting http://www.stata.com.
r(9);

Is there any way I can install plssem in my version of Stata?

Any help would be greatly appreciated.

Thank you and best wishes
Simona

generate variable which is calculating percentage

Hello guys,

I'm relatively new to Stata and I'm stuck with an assignment. It goes like this:
Use the variable “womenpar” to calculate the percentage of men in parliament (tip: use the generate command for this). Make a new variable called “menpar” reporting this.

I'm unsure whether I can do this with only one variable (womenpar) or if I should find another variable in the dataset. I looked for a relevant variable but did not find anything.
I hope you have a suggestion for a proper command. Thanks.
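If womenpar is the percentage of women in parliament on a 0–100 scale (an assumption), its complement gives the answer with a single generate:

Code:
* percentage of men = 100 minus percentage of women
generate menpar = 100 - womenpar
label variable menpar "Percentage of men in parliament"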

Custom labels in a coefplot.

Dear All,

I am producing a regression postestimation plot using the user-written coefplot command. Following is my code, and the graph it generates is attached:

Code:
coefplot, drop(treated2 wk_* post _cons) ///
 title("(1) Prescriptions") xline(29.5) yline(0) /*xsc(r(-4))*/  ///
 ytitle(Change) xtitle(Week relative to KY's new laws) vertical omitted addplot(dot y x, mfcolor(white))  ///
 xlabel(1 10 20 30 40 50 60 70 80 90 100) coeflabels(t1="" t10="2012w10" t20="2012w20" t30="2012w30" t40="2012w40" t50="2012w50" t60="2012w60" ///
 t70="2012w70" t80="2012w80" t90="2012w90",  angle(45))
Despite specifying custom coeflabels, the resulting plot does not have the right labels. How should I change the labelling?

Sincerely,
Sumedha.
[attached coefplot graph]