{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,6]],"date-time":"2026-01-06T01:22:10Z","timestamp":1767662530929,"version":"3.48.0"},"reference-count":30,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2017,8,1]],"date-time":"2017-08-01T00:00:00Z","timestamp":1501545600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Model Assisted Statistics and Applications"],"published-print":{"date-parts":[[2017,8]]},"abstract":"<jats:p>When prediction is a goal, validation utilizing data outside of the prediction effort is desirable. Typically, data is split into two parts: one for development and one for validation. But this approach becomes less attractive when predicting uncommon events, as it substantially reduces power. When predicting uncommon events within a large prospective cohort study, we propose the use of a nested case-control design, which is an alternative to the full cohort analysis. By including all cases but only a subset of the non-cases, this design is expected to produce a result similar to the full cohort analysis. In our framework, variable selection is conducted and a prediction model is fit on those selected variables in the case-control cohort. Then, the fraction of true negative predictions (specificity) of the fitted prediction model in the case-control cohort is compared to that in the rest of the cohort (non-cases) for validation. In addition, we propose an iterative variable selection using random forest for missing data imputation, as well as a strategy for valid classification. 
Our framework is illustrated with an application featuring high-dimensional variable selection in a large prospective cohort study.<\/jats:p>","DOI":"10.3233\/mas-170397","type":"journal-article","created":{"date-parts":[[2017,9,12]],"date-time":"2017-09-12T12:55:24Z","timestamp":1505220924000},"page":"227-237","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["A new framework for prediction and variable selection for uncommon events in a large prospective cohort study"],"prefix":"10.1177","volume":"12","author":[{"given":"Hye-Seung","family":"Lee","sequence":"first","affiliation":[{"name":"University of South Florida","place":["USA"]}]},{"given":"Jeffrey P.","family":"Krischer","sequence":"additional","affiliation":[{"name":"University of South Florida","place":["USA"]}]}],"member":"179","published-online":{"date-parts":[[2017,8]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1198\/jcgs.2009.07118"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1111\/rssc.12056"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0895-4356(03)00207-5"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1995.10476498"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2288-14-40"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/66.1.27"},{"key":"e_1_3_1_9_2","unstructured":"Fleiss, J. L., Levin, B., & Paik, M. C. (2003). Statistical methods for rates and proportions. 3rd edition. 
Wiley."},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v033.i01"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1023228925137"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1214\/08-AOAS169"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1002\/dmrr.2510"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0378-3758(00)00217-2"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1002\/sim.5937"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-9868.2010.00740.x"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/65.1.153"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1111\/biom.12113"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v058.i12"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1365-2362.2012.02727.x"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1093\/oxfordjournals.epirev.a017964"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1002\/sim.6351"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btr597"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1097\/EDE.0b013e3181c30fb2"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1037\/a0016973"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1177\/117693510700300025"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1996.tb02080.x"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1097\/00001648-199103000-00013"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1002\/sim.3943"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1214\/07-AOAS147"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1002\/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3"}],"container-title":["Model Assisted 
Statistics and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/MAS-170397","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/MAS-170397","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/MAS-170397","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,6]],"date-time":"2026-01-06T00:01:36Z","timestamp":1767657696000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/MAS-170397"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,8]]},"references-count":30,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2017,8]]}},"alternative-id":["10.3233\/MAS-170397"],"URL":"https:\/\/doi.org\/10.3233\/mas-170397","relation":{},"ISSN":["1574-1699","1875-9068"],"issn-type":[{"type":"print","value":"1574-1699"},{"type":"electronic","value":"1875-9068"}],"subject":[],"published":{"date-parts":[[2017,8]]}}}