Hi all, I have a problem that I have been stuck on for quite a while.
I have two data sets, case and control. They have exactly the same variables but different firms. There are a few things I need to do:
1. Match each case firm with control firms that have the same industrycode2 and whose IR lies within 0.9 to 1.1 times the case firm's IR in a given year (the priva_year).
2. Once matching is complete, compute the median IR of the control firms matched to each case firm, for the given year and the year after.
Here is a small example.
Data of the case data set:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float year str9 firmid float(industrycode2 priva_year) double IR
1998 "101101628" 3300 2003 .5110084414482117
1999 "101101628" 3300 2003 .3630363345146179
2000 "101101628" 3300 2003 .38560914993286133
2001 "101101628" 3300 2003 .33924877643585205
2002 "101101628" 3300 2003 .33550626039505005
2003 "101101628" 3300 2003 .20422077178955078
2004 "101101628" 3300 2003 .11267662048339844
2005 "101101628" 3300 2003 .09437507390975952
2006 "101101628" 3300 2003 .08572352677583695
2007 "101101628" 3300 2003 .11519040167331696
1998 "101105573" 1300 1999 .07120344787836075
1999 "101105573" 1300 1999 .10620743781328201
2000 "101105573" 1300 1999 .16636085510253906
2001 "101105573" 1300 1999 .07258064299821854
2002 "101105573" 1300 1999 .021276595070958138
1999 "101113645" 4000 2002 .34291747212409973
2002 "101113645" 4000 2002 .7364082932472229
2003 "101113645" 4000 2002 .9800625443458557
2004 "101113645" 4000 2002 .8390606045722961
2005 "101113645" 4000 2002 .37066158652305603
2006 "101113645" 4000 2002 .31881824135780334
2007 "101113645" 4000 2002 .4274105727672577
end
Data of the control data set:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float year str9 firmid float industrycode2 double IR
1998 "160236012" 3300 .1932249516248703
1999 "160236012" 3300 .24407915771007538
2000 "160236012" 3300 .18691548705101013
2001 "160236012" 3300 .1931752860546112
2002 "160236012" 3300 .13328635692596436
2003 "160236012" 3300 .1913904845714569
2004 "160236012" 3300 .16301876306533813
2005 "160236012" 3300 .15310858190059662
2006 "160236012" 3300 .14023752510547638
2007 "160236012" 3300 .1261952966451645
1998 "180963807" 3300 .2148050218820572
1999 "180963807" 3300 .24316874146461487
2000 "180963807" 3300 2.407111167907715
2001 "180963807" 3300 .34913963079452515
2003 "180963807" 3300 .19960474967956543
2004 "180963807" 3300 .110478475689888
2002 "715910825" 3300 .1362578570842743
2003 "715910825" 3300 .19080890715122223
2004 "715910825" 3300 .14886681735515594
2005 "715910825" 3300 .08703862130641937
2007 "715910825" 3300 .27025702595710754
end
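Then I run the code below to complete step 1:
Code:
clear
use "E:\case.dta"
* keep only the privatization-year observation of each case firm
* (no by-prefix needed; -by firmid:- would fail on unsorted data)
keep if year == priva_year
gen lower_IR = IR*0.9
gen upper_IR = IR*1.1 // matching
* rangejoin is a community-contributed command from SSC
rangejoin IR lower_IR upper_IR using "E:\control.dta", by(industrycode2) all // industry and IR match
* keep control observations from the same year as the case observation
keep if year == year_U
save "E:\Research\privatization\match.dta", replace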
And here is an example of the matching result:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float year str9 firmid float(industrycode2 priva_year) double IR str9 firmid_U float year_U double IR_U
2003 "101101628" 3300 2003 .20422077178955078 "715910825" 2003 .19080890715122223
2003 "101101628" 3300 2003 .20422077178955078 "160236012" 2003 .1913904845714569
2003 "101101628" 3300 2003 .20422077178955078 "180963807" 2003 .19960474967956543
end
Now I need to proceed to step 2, which means computing the median IR of firms "715910825", "160236012" and "180963807" (which are matched with firm "101101628"), not only in 2003 but also in 2004 (one year after the priva_year).
I tried merge, but it does not work because one control firm can be matched to multiple case firms. If the control firm has fewer observations than the number of case firms matched to it, some of the matching links are lost, so I can't compute the median this way.
Thanks in advance for any help or suggestions.
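One possible direction for step 2, offered only as a sketch: rather than merge, the official joinby command forms every pairwise combination within a group, so each case-control link keeps its own copy of the control firm's yearly observations even when one control firm is matched to several case firms. The block below is self-contained and uses only rows copied from the example data in this post (control rows for 2002 to 2005 and the three matched pairs). The names case_id, median_IR and n_controls are hypothetical and chosen for illustration.

```stata
* Sketch under stated assumptions; not a definitive implementation.
* Control rows for 2002-2005, copied from the -dataex- example in this post.
clear
input float year str9 firmid float industrycode2 double IR
2002 "160236012" 3300 .13328635692596436
2003 "160236012" 3300 .1913904845714569
2004 "160236012" 3300 .16301876306533813
2005 "160236012" 3300 .15310858190059662
2003 "180963807" 3300 .19960474967956543
2004 "180963807" 3300 .110478475689888
2002 "715910825" 3300 .1362578570842743
2003 "715910825" 3300 .19080890715122223
2004 "715910825" 3300 .14886681735515594
2005 "715910825" 3300 .08703862130641937
end
tempfile control
save `control'

* Matched pairs, copied from the posted matching result.
clear
input str9 firmid float priva_year str9 firmid_U
"101101628" 2003 "715910825"
"101101628" 2003 "160236012"
"101101628" 2003 "180963807"
end

* Keep the case firm under a new (hypothetical) name and let the
* control firm's id become the join key.
rename firmid case_id
rename firmid_U firmid

* Pair every case-control link with every yearly observation of
* that control firm.
joinby firmid using `control'

* Keep the privatization year and the year after.
keep if inrange(year, priva_year, priva_year + 1)

* One row per case firm and year: median IR of its matched controls
* and the number of control observations behind that median.
collapse (median) median_IR = IR (count) n_controls = IR, by(case_id priva_year year)
list, sepby(case_id)
```

For case firm "101101628" this should give a median IR of about .1914 for 2003 and about .1489 for 2004, each based on three control firms. With the real files, the match file would first need to be reduced to firmid, priva_year and firmid_U before the renames, because joinby keeps the master's values for any variable name that exists in both data sets (here year, IR and industrycode2). The n_controls column shows where a control firm has no observation in the following year.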