Hello everyone! I am having this problem (see the attached file) when attempting to install fre and other commands. The path shown in red is not even a path where the Stata folder exists. I've tried launching as an administrator, but it didn't help. My OS is Windows 10 and my Stata version is 16. Please, could you help me? Thank you!
↧
Error (603)
↧
Remove all ID duplicates except the last.
Hi. I am pretty new to Stata and I find myself struggling with a dataset that I have been working on; I hope someone can help me out.
My dataset shows a list of hospital departments and how many patients have been on each department for a specific period.
The first id duplicates show patients who moved to the department from other hospital departments, and the last id duplicate shows the sum of all the patients who have been on the department in that period.
As I am only interested in the last id duplicate, which shows the total number of patients at each department, I have been trying for a long while to figure out how to drop all id duplicates but the last, alas unsuccessfully.
I have tried to read up on the various duplicates commands and to search online, but I can't figure out how to get it right.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str62 afdeling str26 Startbelægning str11 Nyindlagte str8 Tilflyttede str11 Sengedøgn
"OUH Anæstesi-Intensiv Afs. Vita (Odense) (4202011)" "" "" "" ""
"OUH Anæstesi-Intensiv Afs. Vita (Odense) (4202011)" "8,00" "" "115,00" "236,85"
"OUH Anæstesi-Intensiv Afs. Vita (Odense) (4202011)" "3,00" "" "35,00" "108,96"
"OUH Anæstesi-Intensiv Afs. Vita (Odense) (4202011)" "" "" "1,00" "0,08"
"OUH Anæstesi-Intensiv Afs. Vita (Odense) (4202011)" "11,00" "" "151,00" "345,88"
"OUH Barselsafsnit Vuggen (Odense) ()" "" "" "" ""
"OUH Barselsafsnit Vuggen (Odense) ()" "" "" "" ""
"OUH Brystkirurgisk Afs. (Odense) ()" "" "" "" ""
"OUH Brystkirurgisk Afs. (Odense) ()" "" "39,00" "17,00" "9,94"
"OUH Brystkirurgisk Afs. (Odense) ()" "" "39,00" "17,00" "9,94"
"OUH Endokrinologisk Afs. MCS (Odense) ()" "" "" "" ""
"OUH Endokrinologisk Afs. MCS (Odense) ()" "" "4,00" "" "8,99"
"OUH Endokrinologisk Afs. MCS (Odense) ()" "10,00" "73,00" "13,00" "320,87"
"OUH Endokrinologisk Afs. MCS (Odense) ()" "" "" "1,00" "0,00"
"OUH Endokrinologisk Afs. MCS (Odense) ()" "" "" "1,00" "1,19"
"OUH Endokrinologisk Afs. MCS (Odense) ()" "" "" "1,00" "5,07"
"OUH Endokrinologisk Afs. MCS (Odense) ()" "10,00" "77,00" "16,00" "336,12"
"OUH Endokrinologisk Afs. ME (Odense) ()" "" "" "" ""
"OUH Endokrinologisk Afs. ME (Odense) ()" "5,00" "13,00" "7,00" "199,99"
"OUH Endokrinologisk Afs. ME (Odense) ()" "5,00" "13,00" "7,00" "199,99"
end
Any help would be appreciated.
Sincerely
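A possible starting point (untested sketch), assuming the department name afdeling identifies the duplicates and the rows are already in the order shown, with the running total last within each department: number the rows, then keep the last row per department.

Code:
* sketch: keep only the last observation within each department
generate long obsno = _n
bysort afdeling (obsno): keep if _n == _N
drop obsno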
↧
↧
Graphics not displaying correctly with update to Big Sur
Updated to Big Sur (macOS 11), and now when a graph is created it is entirely gray. If the window is resized or the Graph Editor is toggled on and off, the graphics appear. Exporting appears to work fine, producing the desired graph.
↧
ASROL command to calculate statistics - 3 trimesters per person
Hi Everyone,
Please, could anyone help me?
I am trying to calculate the average of my main variable per trimester of birth. I therefore need to create three mean variables, one per trimester.
My data set has years and months of birth and the city of birth. Thus, if a person was born in August 2001, she will have 3 mean variables: Trimester3, corresponding to the average over months 8, 7, 6 of year 2001; Trimester2 = average over months 5, 4, 3 of year 2001; Trimester1 = average over months 2, 1 (year 2001) and month 12 (year 2000).
The variable of interest varies per municipality of birth.
I am trying to use the command asrol, but I am not getting it right.
The data is below.
Many thanks!
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long city float(year month pounds)
3500105 2000 11 8.893595
3500105 2000 12 9.779661
3500105 2001 1 8.885256
3500105 2001 2 10.613245
3500105 2001 3 9.326846
3500105 2001 4 12.051405
3500105 2001 5 11.074306
3500105 2001 6 12.389572
3500105 2001 7 13.54998
3500105 2001 8 21.330805
3500105 2001 9 24.322857
3500105 2001 10 16.31012
3500105 2001 11 10.40746
3500105 2001 12 17.042206
3500105 2002 1 9.985758
3500105 2002 2 8.9549055
3500105 2002 3 10.66387
3500105 2002 4 10.508817
3500105 2002 5 10.20454
3500105 2002 6 12.255505
3500105 2002 7 14.589113
3500105 2002 8 16.103226
3500105 2002 9 28.74085
3500105 2002 10 27.44958
3500105 2002 11 14.592
3500105 2002 12 11.710034
3500105 2003 1 10.322106
3500105 2003 2 11.73029
3500105 2003 3 9.415467
3500105 2003 4 10.967172
3500105 2003 5 12.821796
3500105 2003 6 14.488111
3500105 2003 7 16.742008
3500105 2003 8 17.876831
3500105 2003 9 23.13024
3500105 2003 10 17.579388
3500105 2003 11 11.701017
3500105 2003 12 11.356686
3500105 2004 1 11.913955
3500105 2004 2 10.188246
3500105 2004 3 10.160151
3500105 2004 4 9.966292
3500105 2004 5 13.522099
3500105 2004 6 12.356362
3500105 2004 7 13.012424
3500105 2004 8 20.36766
3500105 2004 9 34.670647
3500105 2004 10 21.836933
3500105 2004 11 11.615482
3500105 2004 12 10.693392
3500204 2000 11 10.358713
3500204 2000 12 8.871467
3500204 2001 1 9.15312
3500204 2001 2 9.650501
3500204 2001 3 9.463419
3500204 2001 4 10.463036
3500204 2001 5 10.950296
3500204 2001 6 11.301744
3500204 2001 7 14.441365
3500204 2001 8 20.066906
3500204 2001 9 22.26807
3500204 2001 10 15.506084
3500204 2001 11 9.636392
3500204 2001 12 8.872421
3500204 2002 1 12.398956
3500204 2002 2 10.35131
3500204 2002 3 9.885847
3500204 2002 4 12.279608
3500204 2002 5 11.875782
3500204 2002 6 13.750143
3500204 2002 7 15.27816
3500204 2002 8 16.03661
3500204 2002 9 27.543165
3500204 2002 10 24.555077
3500204 2002 11 13.9462
3500204 2002 12 10.713799
3500204 2003 1 9.22687
3500204 2003 2 10.448536
3500204 2003 3 10.020823
3500204 2003 4 11.394024
3500204 2003 5 12.38753
3500204 2003 6 15.488603
3500204 2003 7 18.095093
3500204 2003 8 20.52884
3500204 2003 9 21.885326
3500204 2003 10 16.879602
3500204 2003 11 12.138964
3500204 2003 12 9.985505
3500204 2004 1 8.992638
3500204 2004 2 9.131886
3500204 2004 3 10.326132
3500204 2004 4 10.864487
3500204 2004 5 13.133472
3500204 2004 6 12.328797
3500204 2004 7 12.959672
3500204 2004 8 19.037
3500204 2004 9 31.31487
3500204 2004 10 18.96614
3500204 2004 11 11.524967
3500204 2004 12 10.09326
end
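One possible sketch uses rangestat (from SSC) rather than asrol, since rangestat takes explicit lagged windows; mdate and trim1-trim3 are illustrative names, and the intervals follow the trimester definition above (months m-2..m, m-5..m-3, and m-8..m-6 relative to a birth month m):

Code:
* untested sketch: rolling trimester means by municipality; each row's
* trims are computed relative to that row's own month
generate mdate = ym(year, month)   // monthly date
format mdate %tm
rangestat (mean) trim3 = pounds, interval(mdate -2 0) by(city)
rangestat (mean) trim2 = pounds, interval(mdate -5 -3) by(city)
rangestat (mean) trim1 = pounds, interval(mdate -8 -6) by(city)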
↧
Predicted probabilities after Logistic Regression with interactions
Hello,
I would like to compare and test the predicted probabilities of a 3-way interaction term in a logistic regression model. I would appreciate your advice on whether my approach outlined below is appropriate. Each of the four variables is binary. The dataset used is a slight modification of the opt-2by2by2 data from Michael Mitchell's book on interpreting regression models (opt_dum was created as 0 if opt<52, and 1 otherwise).
Code:
logit opt_dum i.depstat##i.treat##i.season, or

--------------------------------------------------------------------------------------
             opt_dum | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
             depstat |
               Mild  |   .4642857   .4214978    -0.85   0.398     .0783483    2.751321
                     |
               treat |
                 HT  |       32.5   23.62599     4.79   0.000     7.818066    135.1037
                     |
       depstat#treat |
            Mild#HT  |   .2153846   .2373315    -1.39   0.164     .0248473    1.867027
                     |
              season |
             Summer  |   4.970588   3.237407     2.46   0.014     1.386786    17.81583
                     |
      depstat#season |
        Mild#Summer  |    .704142    .761078    -0.32   0.746     .0846509    5.857184
                     |
        treat#season |
          HT#Summer  |   .5633136   .6170592    -0.52   0.600     .0658167    4.821306
                     |
depstat#treat#season |
     Mild#HT#Summer  |   9.129652   14.33193     1.41   0.159     .4209394    198.0108
                     |
               _cons |   .1538462   .0826286    -3.49   0.000     .0536931    .4408138
--------------------------------------------------------------------------------------
I use the following margins command to obtain the predicted probabilities for each variable and value (output not shown).

Code:
margins i.depstat##i.treat##i.season, post coeflegend
If I would like to compare and test the predicted probabilities, say for the Mild#HT#Summer term (as compared to the relevant reference categories), would the following code be appropriate?

Code:
margins r.depstat#r.treat#r.season, contrast(effects)
Thank you,
Caroline
↧
↧
Errors encountered while importing SPSS files into Stata
I am trying to import several SPSS files into Stata using the import spss command. With two particular files, I encounter errors which prevent me from importing. The first looks like this:
Code:
. import spss using "file1.sav", clear
note: invalid numeric format %
note: invalid numeric format %
note: invalid numeric format %
note: invalid numeric format %
note: invalid numeric format %
note: invalid numeric format %
error reading file
r(692);
This error occurs even when I only try to import variables which I know to be strings from the file using the import spss dialog box.

When trying to import another SPSS file, I encounter this error:
Code:
. import spss using "file2.sav", clear
error reading file
r(692);
Trying to import just the first observation from these files does not work either.

Strangely enough, the import spss dialog box variable preview displays all of the variables, so clearly Stata can access the file and its contents but can't import it. I don't have access to SPSS, so I can't open the files to see what the problem is.
Any help is appreciated.
↧
how to change position of observations using a loop
Dear all, I am trying to write a loop that moves some observations from the bottom rows into the upper rows. I use a basic loop (the first one below) that works well, but it is too long. Since I must do this for several blocks of different variables, I wonder whether something more compact, similar to the second loop below, is possible. Of course, the second loop only works for the first variable, and I do not know how to automate it for the rest of the variables (Mn_x2 Mn_x3). See below also an example of the data at hand.
Many thanks in advance.
Code:
foreach i in Mn_x1 Mn_x2 Mn_x3 {
    replace `i' = `i'[24] in 1
    replace `i' = `i'[25] in 2
    replace `i' = `i'[26] in 3
    replace `i' = `i'[27] in 4
    replace `i' = `i'[28] in 5
    replace `i' = `i'[29] in 6
    replace `i' = `i'[30] in 7
    replace `i' = `i'[31] in 8
    replace `i' = `i'[32] in 9
    replace `i' = `i'[33] in 10
    replace `i' = `i'[34] in 11
    replace `i' = `i'[35] in 12
    replace `i' = `i'[36] in 13
    replace `i' = `i'[37] in 14
    replace `i' = `i'[38] in 15
    replace `i' = `i'[39] in 16
    replace `i' = `i'[40] in 17
    replace `i' = `i'[41] in 18
    replace `i' = `i'[42] in 19
    replace `i' = `i'[43] in 20
    replace `i' = `i'[44] in 21
    replace `i' = `i'[45] in 22
    replace `i' = `i'[46] in 23
}

local j = 23
foreach i in Mn_x1 Mn_x2 Mn_x3 {
    forvalues z = 1/23 {
        local j = `j'+1
        replace `i' = `i'[`j'] in `z'
    }
}
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(M_x0 Mn_x1 Mn_x2 Mn_x3)
1725 . . .
353 . . .
265 . . .
279 . . .
250 . . .
238 . . .
203 . . .
199 . . .
183 . . .
191 . . .
159 . . .
158 . . .
222 . . .
140 . . .
204 . . .
227 . . .
232 . . .
160 . . .
144 . . .
334 . . .
296 . . .
277 . . .
319 . . .
. 1725 649 25
. 647 981 622
. 492 875 572
. 431 875 580
. 377 886 601
. 512 1028 708
. 479 1073 685
. 425 1029 613
. 412 1098 659
. 371 1077 655
. 350 986 607
. 451 1215 695
. 501 1181 615
. 432 910 486
. 379 715 356
. 376 663 317
. 358 559 293
. 337 499 273
. 320 618 291
. 606 910 375
. 617 765 236
. 623 617 273
. 623 798 312
end
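For what it's worth, a compact version is possible because the offset between source and target rows is a fixed 23; deriving the source row from the target row also avoids the bug in the second loop above, where `j' keeps growing across variables, which is why only the first variable works (a sketch, untested):

Code:
* sketch: copy rows 24-46 into rows 1-23 for each variable
foreach i in Mn_x1 Mn_x2 Mn_x3 {
    forvalues z = 1/23 {
        replace `i' = `i'[`z' + 23] in `z'
    }
}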
↧
Extract a variable list based on variable format?
Hi,
I am trying to loop through datasets and process the date variables the same way (e.g., convert a daily date to month to check the frequencies over the course of the year using -tab-). Different datasets have different numbers and names of date variables, but whether the dataset has 1 or 15 dates, I want to do the same thing to all of them per dataset. For example:
- Dataset 1 might have just date_of_birth plus a dozen other non-date variables.
- Dataset 2 might have date_of_birth date_of_death admission_date and discharge_date plus another dozen non-date variables (and unfortunately not always with the same naming convention across all the date variables).
- But all the dates are formatted as %td.
- I'm imagining we'd have to loop through the full variable list in the open dataset, flag each variable that has format=%td and then store the list of found variables as a macro variable, after which I'd loop through each value in the macro variable to do whatever (e.g., recode date_of_birth to a variable stored as %tm and then -tab- the new monthly date variable and repeat for any additional date vars).
- I see in the -local- extended functions you can extract attributes for a given variable, but not extract variables based on a particular attribute (e.g., when format is %td). Perhaps this could be used to set a condition for the loop imagined above? (A sketch follows below.)
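As it happens, -ds- can select variables by display format directly, which yields exactly the macro imagined above; a minimal sketch (the `v'_m names are illustrative):

Code:
* sketch: collect all %td variables, then tab a monthly version of each
ds, has(format %td)
local datevars `r(varlist)'
foreach v of local datevars {
    generate `v'_m = mofd(`v')   // daily date -> monthly date
    format `v'_m %tm
    tab `v'_m
}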
↧
Rangerun AND Loops - Variables within program not getting created
Explanation: I realize that rangerun itself emulates a loop structure, but I want to run it for multiple variables. When I run my program, no new variables are created, and I want to know if there is a fatal flaw in my thinking. I also turned trace on and found that the `thing' from inside the foreach loop is not getting passed to the program three_mths.
My code looks almost exactly like this:
I realize that you might ask me to unroll the loop and call rangerun separately for each of the variables high, low, open, and close. However, the actual dataset I am working with has more than 20 variables, and I need to do the percentile computation for 3, 6, 9, and 12 months, which is why getting this small piece of code right is so important.
I am also open to other suggestions which might make this easier and/or correct.
Thank you so much, in advance.
Code:
sysuse sp500.dta, clear
drop volume change

global things high low open close

clear programs
program three_mths
    egen `1'_3m = pctile(`1'), p(25)
    label variable `1'_3m "Top 25 - last 3 months"
end

set trace on
foreach thing in $things {
    rangerun three_mths, use(`thing') inter(date -90 -1)
}
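A hedged sketch of one workaround: rangerun calls the program with no arguments, so `1' is never filled in. Assuming rangerun (from SSC) copies back, for each observation, the first-row values of variables the program creates, the program can refer to a fixed name instead, with each variable cloned into that name in turn (x and x_3m are illustrative names):

Code:
* untested sketch
capture program drop three_mths
program three_mths
    egen x_3m = pctile(x), p(25)   // 25th percentile within the interval
end

foreach thing in high low open close {
    clonevar x = `thing'
    rangerun three_mths, use(x) interval(date -90 -1)
    rename x_3m `thing'_3m
    label variable `thing'_3m "Top 25 - last 3 months"
    drop x
}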
↧
↧
How to make vertical dataset from horizontally-merged data
Hi, I'm new to the forums and relatively new to Stata.
I have a problem which is that I have a table which is similar to below:
todaya | missa | todayb | missb |
25 January 16 | 0.3 | 12 September 15 | 0.4 |
27 November 15 | 0.2 | 10 October 16 | 0.6 |
What I wish to have is a vertical table with each value, i.e.

today | miss |
25 January 16 | 0.3 |
27 November 15 | 0.2 |
12 September 15 | 0.4 |
10 October 16 | 0.6 |
How do I do this? I assume it involves append.
I have already renamed all the variables to be today, miss, etcetera.
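One possible route is official Stata's -stack- rather than append; a minimal sketch, assuming the variables are named as in the tables above:

Code:
* sketch: stack the (todayb, missb) pair underneath (todaya, missa)
stack todaya missa todayb missb, into(today miss) clear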
↧
Xtserial (N versus T) theoretical question
Hi all,
I had a question earlier that I posted about saving results from xtserial (from the Stata Journal), but I've now got a different and more theoretical question about xtserial so I thought I should make a second post (I hope this is not considered cross-posting, that was not my intention).
I've got panel data that I'm running an OLS regression on and clustering by id. I run a few different versions of my base regression (regress logthpp logprevcpn, cluster(id)), narrowing my panel down and adding in dummy variables. At its largest, my sample has N=77 (as in 77 different groups I am clustering by; I have over 2,000 observations) and T=70. Since my N and T are relatively similar in size, is this something that I should worry about? In particular, as I narrow my sample, my T=40 at its lowest but my N=10. Here, when N is small and T is larger, should I worry about clustering?
Thanks for any thoughts.
↧
x
Sorry, trying to delete and repost
↧
Counting unique observations
Hello all. Attempting to repost this question from earlier, where I was having trouble with dataex. Here I have some test score data where I am hoping to count how many times an ID shows up with a unique combination of my other 3 variables. So here it would be 25, since there are 3 instances where an ID shows up with the exact same tut, test, and score values. I hope that makes sense, and thank you for your help!
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id str1(tut test) byte score
1 "A" "A" 10
1 "A" "B" 10
1 "B" "B" 11
1 "B" "B" 11
1 "B" "B" 9
2 "B" "W" 5
2 "B" "W" 5
2 "D" "D" 14
2 "D" "D" 14
2 "D" "W" 14
2 "D" "F" 34
2 "F" "A" 23
3 "A" "A" 32
3 "E" "E" 20
3 "E" "F" 20
3 "F" "F" 20
3 "A" "F" 4
3 "S" "F" 3
3 "F" "F" 24
3 "S" "F" 25
4 "X" "A" 12
4 "Y" "A" 3
4 "W" "A" 3
4 "W" "W" 4
4 "W" "W" 10
4 "W" "W" 10
4 "A" "W" 11
4 "A" "A" 34
end
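A minimal sketch of one way to get the count: tag the first occurrence of each id-tut-test-score combination and count the tags (-duplicates report id tut test score- summarizes the same information):

Code:
* sketch
egen byte tag = tag(id tut test score)
count if tag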
↧
↧
Select a subsample from a database
Hi, I have a database with 5000 obs of which I only need 600. I have used the following syntax:
set seed 10101
sample 600 if valid == 1, count
(valid is a conditional variable for the selection)
However, running -sample- removes all other observations. I want to create a variable that assigns 1 to the selected observations and 0 to the unselected ones, instead of deleting them.
Any advice for it?
Thank you,
Adolfo Aramburu
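A minimal sketch of one way to flag rather than drop, assuming at least 600 observations have valid == 1 (u, urank, and selected are illustrative names):

Code:
* untested sketch
set seed 10101
generate double u = runiform() if valid == 1   // draws for eligible obs only
egen long urank = rank(u), unique              // missing u -> missing rank
generate byte selected = (urank <= 600)        // missing rank is never <= 600
drop u urank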
↧
Stacked bar graph
Hi everyone,
I need help with a stacked bar graph. I have the graph below, and I need to add the highest number to each column.
[graph attachment]
When I have a simple stacked bar graph I use: twoway bar ... || rbar ... || rbar ... || scatter ... But I made this graph with: graph bar var1, over(a) over(b) over(c) stack
Thank you for your help.
Paula
↧
Interpreting Results of IV Probit VS. 2SLS
Hi everyone. I am using Stata to measure the impact of remittances (a dummy variable = 1 if a child is a member of a household that receives remittances, and 0 otherwise) on child education (also a dummy = 1 if a child is enrolled in school, and 0 otherwise). I used 2 models: 2SLS and IV probit. I am having some problems interpreting my results and comparing them across the two methods. The coefficient of remittances = 0.23 in my 2SLS regression, and = 0.19 in my IV probit regression (average marginal effects). As far as I understood from the forum, I need to use percentage points when interpreting the marginal effects. So I believe the interpretation of the average marginal effects would be: being a member of a household that receives remittances increases the probability of school enrollment by 19 percentage points. And in 2SLS, the interpretation would be: if the household receives remittances, then on average, the probability of school enrollment increases by 23%. Are these interpretations correct? If yes, I am not sure how to compare the results now that one is expressed in percent and the other in percentage points. Or do I only compare the signs of the coefficients in this case?
Many thanks in advance.
↧
-mylabels- issue with invalid label specifier
Used the following code:
mylabels "1 Jun 2015" "1 Aug 2015" "1 Oct 2015" "1 Dec 2015" "1 Feb 2016" , myscale(clock("@", "DMY")) local(labels)
However, when I try
xtline adhpc, overlay legend(off) xlabel(labels)
I get the error: "invalid label specifier, : labels:"
Why is this? I thought I used the correct local macro...
mylabels "1 Jun 2015" "1 Aug 2015" "1 Oct 2015" "1 Dec 2015" "1 Feb 2016" , myscale(clock("@", "DMY")) local(labels)
However when I try
xtline adhpc, overlay legend(off) xlabel(labels)
I get the error: invalid label specifier, : labels:
Why is this? I thought I used the correct local macro...
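The likely fix (a sketch): a local macro must be expanded with left and right quotes when it is used, so xlabel(labels) passes the literal word labels rather than the stored list:

Code:
* note the `' around labels in xlabel()
mylabels "1 Jun 2015" "1 Aug 2015" "1 Oct 2015" "1 Dec 2015" "1 Feb 2016", ///
    myscale(clock("@", "DMY")) local(labels)
xtline adhpc, overlay legend(off) xlabel(`labels')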
↧
↧
Creating a three way graph with an unbalanced panel
Hi there,
I am trying to create a graph that looks at a total firm value (y-axis), year (x-axis), with 4 lines for each firm category. There are multiple firms each year and it's an unbalanced panel.
I would ideally love something like the graph below on the left, but my attempts to plot a line graph, even without the categories, give me the last graph.
Thank you.
[graph attachments]
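A possible sketch, assuming hypothetical variable names firmvalue, year, and a four-level firm category cat: aggregate to one value per year and category, then draw one line per category.

Code:
* untested sketch
collapse (sum) firmvalue, by(year cat)
separate firmvalue, by(cat)   // creates firmvalue1 ... firmvalue4
twoway line firmvalue1 firmvalue2 firmvalue3 firmvalue4 year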
↧
foreach loop
Hello there,
I have two sets of variables: income of individuals from 1968 to 2017, and their ages for the same years. I want to create a variable that indicates income at a specific age, like the income of each individual at the age of 40. I wrote the following loop, but it assigned 2017's income for all ages.
generate labory40=.
foreach x of varlist labory1968-labory2017 {
    foreach y of varlist age1968-age2017 {
        replace labory40=`x' if `y'==40
    }
}
generate labory41=.
foreach x of varlist labory1968-labory2017 {
    foreach y of varlist age1968-age2017 {
        replace labory41=`x' if `y'==41
    }
}
.
.
.
Any help is highly appreciated,
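A sketch of a possible fix: the nested loops pair every income variable with every age variable, so the last pairing wins. Looping over years and pairing each year's income with the same year's age avoids that (labory_at40 and labory_at41 are illustrative names):

Code:
* untested sketch
foreach a of numlist 40 41 {
    generate labory_at`a' = .
    forvalues yr = 1968/2017 {
        replace labory_at`a' = labory`yr' if age`yr' == `a'
    }
}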
↧
Mixed effects repeated measures model - not all variables measured at each time point.
Hi everyone, hope you're all well.
First post - so please let me know if I have missed anything from the posting FAQ,
I've got a question regarding the use of mixed effects ordinal logistic regression for repeated measures analysis, where I have repeated measures of the dependent variable at 5 time points (e.g. days 0, 28, 90, 180, 365) as well as multiple independent variables, which may be measured at only some of those time points, e.g. only at Day 180 and Day 365. My dependent variable is a scale with categories from 0 to 6.
I am interested in utilising all of the repeated measures in my model, as well as being able to predict the outcome variable at different time points. However, if I include repeated-measures independent variables that are not measured at every time point, I've realised that naturally my model will omit the time variable (due to empty observations). It also has the curious effect of removing one of the cutpoints (there should be 6 for a 7-point scale).
I was wondering if there is any way to utilise all the repeated measures variables and the time variable? My first thought was imputation or some linear back-extrapolation over time, but I don't know if that would be particularly rigorous.
I have attached a sample of my code and corresponding output; my apologies if it is unclear. To simplify the code, I have included my repeated measures dependent variable (POS), the time variable (i.days) and 2 repeated measures independent variables: PHQ (measured at all 5 time points) and SF36 (measured at only 2 time points; it appears as EQ5D5L in the code below).
I've attempted to understand this by looking through the documentation and forum but to no avail. Hope you made it through the long post and I would be very grateful for any suggestions.
Code:
reshape long POS PHQ EQ5D5L, i(regono) j(days)
xtset regono days
meologit POS i.days PHQ
meologit POS i.days EQ5D5L
meologit POS i.days PHQ EQ5D5L
Output with independent variable measured at all 5 time points (PHQ):

Code:
------------------------------------------------------------------------------
         POS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        days |
         28  |  -1.149115   .0740449   -15.52   0.000    -1.294241    -1.00399
         90  |  -1.836681   .0763513   -24.06   0.000    -1.986327   -1.687035
        180  |  -2.288859   .0789507   -28.99   0.000    -2.443599   -2.134118
        365  |  -2.433143   .0835774   -29.11   0.000    -2.596952   -2.269335
             |
         PHQ |    .105064   .0063705    16.49   0.000      .092578      .11755
-------------+----------------------------------------------------------------
       /cut1 |  -4.185908   .0825713                     -4.347745   -4.024071
       /cut2 |  -1.772225   .0666063                     -1.902771   -1.641679
       /cut3 |  -.8859034    .064558                     -1.012435   -.7593722
       /cut4 |   .3948111   .0623225                      .2726613    .5169609
       /cut5 |    2.54402   .0806856                      2.385879    2.702161
       /cut6 |   4.327243   .1559541                      4.021578    4.632907
Output with independent variable measured at only 2 time points (EQ5D5L):

Code:
------------------------------------------------------------------------------
         POS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        days |
          0  |          0  (empty)
         28  |          0  (empty)
         90  |          0  (empty)
        180  |  -8.499786   .3015914   -28.18   0.000    -9.090894   -7.908677
        365  |          0  (omitted)
             |
      EQ5D5L |  -8.175158   .2752685   -29.70   0.000    -8.714675   -7.635642
-------------+----------------------------------------------------------------
       /cut1 |  -9.343137   .2739839                     -9.880135   -8.806138
       /cut2 |  -6.464956    .243897                     -6.942985   -5.986927
       /cut3 |  -5.132735   .2294906                     -5.582528   -4.682942
       /cut4 |  -2.003746   .1904413                     -2.377004   -1.630488
       /cut5 |   2.974826   .3917165                      2.207076    3.742576
------------------------------------------------------------------------------
Output with independent variables measured at only 2 time points (EQ5D5L) and at 5 time points (PHQ):

Code:
------------------------------------------------------------------------------
         POS |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        days |
          0  |          0  (empty)
         28  |          0  (empty)
         90  |          0  (empty)
        180  |  -8.465161   .3186536   -26.57   0.000    -9.089711   -7.840612
        365  |          0  (omitted)
             |
         PHQ |   .0048066   .0144241     0.33   0.739     -.023464    .0330773
      EQ5D5L |  -8.141514   .2928335   -27.80   0.000    -8.715457   -7.567571
-------------+----------------------------------------------------------------
       /cut1 |  -9.302017   .3001149                     -9.890231   -8.713803
       /cut2 |  -6.422326   .2750666                     -6.961447   -5.883206
       /cut3 |  -5.089812   .2628265                     -5.604942   -4.574681
       /cut4 |  -1.966184    .221139                     -2.399608   -1.532759
       /cut5 |   3.007101   .4038256                      2.215617    3.798584
↧