I am not familiar with this type of analysis, and so would greatly appreciate advice on the following.
I would like to develop and internally validate a logistic regression model using bootstrap sampling.
Using the nlsw88 dataset that ships with Stata as an example, I wrote the following code:
sysuse nlsw88, clear
cd "C:\Documents"
save nlsw88, replace
capture program drop mysim
program define mysim, rclass
    use nlsw88, clear
    bsample
    merge m:1 idcode using nlsw88
    * Fit the logistic regression model on the bootstrap sample
    logit union south grade if _merge == 3
    matrix b = e(b)
    * Test the model on the subjects that were not sampled
    lroc union if _merge == 2, nograph beta(b)
    return scalar area = r(area)
end
simulate area=r(area), reps(10000): mysim
_pctile area, p(2.5 50 97.5)
ret list
* Gives the median validation AUROC and the accompanying 95% percentile interval.
Does this way of going about the task make sense?
Thank you for your feedback.
Best wishes,
Miranda