{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T05:57:47Z","timestamp":1769579867428,"version":"3.49.0"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1009956","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,4,8]],"date-time":"2022-04-08T00:00:00Z","timestamp":1649376000000}}],"reference-count":48,"publisher":"Public Library of Science (PLoS)","issue":"3","license":[{"start":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T00:00:00Z","timestamp":1648512000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NSF IIS","award":["1850360"],"award-info":[{"award-number":["1850360"]}]},{"name":"NSF DBI IIBR","award":["2047631"],"award-info":[{"award-number":["2047631"]}]},{"name":"Showalter Young Investigator Award"},{"name":"Indiana University Simon Cancer Center"},{"name":"Precision Health Initiative of Indiana University"}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Metastatic cancer accounts for over 90% of all cancer deaths, and evaluations of metastasis potential are vital for minimizing the metastasis-associated mortality and achieving optimal clinical decision-making. Computational assessment of metastasis potential based on large-scale transcriptomic cancer data is challenging because metastasis events are not always clinically detectable. The under-diagnosis of metastasis events results in biased classification labels, and classification tools using biased labels may lead to inaccurate estimations of metastasis potential. This issue is further complicated by the unknown metastasis prevalence at the population level, the small number of confirmed metastasis cases, and the high dimensionality of the candidate molecular features. 
Our proposed algorithm, called <jats:bold>P<\/jats:bold>ositive and unlabeled <jats:bold>L<\/jats:bold>earning from <jats:bold>U<\/jats:bold>nbalanced cases and <jats:bold>S<\/jats:bold>parse structures (<jats:bold>PLUS<\/jats:bold>), is the first to use a positive and unlabeled learning framework to account for the under-detection of metastasis events in building a classifier. PLUS is specifically tailored for studying metastasis that deals with the unbalanced instance allocation as well as unknown metastasis prevalence, which are not considered by other methods. PLUS achieves superior performance on synthetic datasets compared with other state-of-the-art methods. Application of PLUS to The Cancer Genome Atlas Pan-Cancer gene expression data generated metastasis potential predictions that show good agreement with the clinical follow-up data, in addition to predictive genes that have been validated by independent single-cell RNA-sequencing datasets.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1009956","type":"journal-article","created":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T18:07:59Z","timestamp":1648577279000},"page":"e1009956","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":18,"title":["PLUS: Predicting cancer metastasis potential based on positive and unlabeled 
learning"],"prefix":"10.1371","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4493-6997","authenticated-orcid":true,"given":"Junyi","family":"Zhou","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6211-6215","authenticated-orcid":true,"given":"Xiaoyu","family":"Lu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8113-8252","authenticated-orcid":true,"given":"Wennan","family":"Chang","sequence":"additional","affiliation":[]},{"given":"Changlin","family":"Wan","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7987-9825","authenticated-orcid":true,"given":"Xiongbin","family":"Lu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9553-0925","authenticated-orcid":true,"given":"Chi","family":"Zhang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8645-848X","authenticated-orcid":true,"given":"Sha","family":"Cao","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2022,3,29]]},"reference":[{"issue":"4","key":"pcbi.1009956.ref001","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1038\/nrc.2016.25","article-title":"Targeting metastasis","volume":"16","author":"PS Steeg","year":"2016","journal-title":"Nature reviews cancer"},{"issue":"6","key":"pcbi.1009956.ref002","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1038\/nrc1886","article-title":"Metastasis: a question of life or death","volume":"6","author":"P Mehlen","year":"2006","journal-title":"Nature reviews cancer"},{"issue":"4","key":"pcbi.1009956.ref003","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1016\/j.cell.2006.11.001","article-title":"Cancer metastasis: building a framework","volume":"127","author":"GP 
Gupta","year":"2006","journal-title":"Cell"},{"issue":"8","key":"pcbi.1009956.ref004","doi-asserted-by":"crossref","first-page":"895","DOI":"10.1038\/nm1469","article-title":"Tumor metastasis: mechanistic insights and clinical challenges","volume":"12","author":"PS Steeg","year":"2006","journal-title":"Nature medicine"},{"issue":"14","key":"pcbi.1009956.ref005","doi-asserted-by":"crossref","first-page":"5649","DOI":"10.1158\/0008-5472.CAN-10-1040","article-title":"AACR centennial series: the biology of cancer metastasis: historical perspective","volume":"70","author":"JE Talmadge","year":"2010","journal-title":"Cancer research"},{"issue":"2","key":"pcbi.1009956.ref006","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1016\/j.cell.2011.09.024","article-title":"Tumor metastasis: molecular insights and evolving paradigms","volume":"147","author":"S Valastyan","year":"2011","journal-title":"Cell"},{"issue":"7667","key":"pcbi.1009956.ref007","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1038\/nature23306","article-title":"Integrative clinical genomics of metastatic cancer","volume":"548","author":"DR Robinson","year":"2017","journal-title":"Nature"},{"issue":"6900","key":"pcbi.1009956.ref008","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1038\/418823a","article-title":"Metastasis genes: a progression puzzle","volume":"418","author":"R Bernards","year":"2002","journal-title":"Nature"},{"key":"pcbi.1009956.ref009","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1146\/annurev-pathol-020117-044127","article-title":"Cancer metastasis: a reappraisal of its underlying mechanisms and their relevance to treatment","volume":"13","author":"N Riggi","year":"2018","journal-title":"Annual Review of Pathology: Mechanisms of Disease"},{"issue":"1","key":"pcbi.1009956.ref010","doi-asserted-by":"crossref","first-page":"728","DOI":"10.1038\/s41467-019-13825-8","article-title":"A deep learning system accurately classifies primary and metastatic 
cancers using passenger mutation patterns","volume":"11","author":"W Jiao","year":"2020","journal-title":"Nature Communications"},{"issue":"1","key":"pcbi.1009956.ref011","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1186\/bcr562","article-title":"Expression profiling predicts outcome in breast cancer","volume":"5","author":"LJ van\u2019t Veer","year":"2002","journal-title":"Breast Cancer Research"},{"issue":"6871","key":"pcbi.1009956.ref012","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1038\/415530a","article-title":"Gene expression profiling predicts clinical outcome of breast cancer","volume":"415","author":"LJ Van\u2019t Veer","year":"2002","journal-title":"nature"},{"issue":"2","key":"pcbi.1009956.ref013","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pbio.0020007","article-title":"Gene expression signature of fibroblast serum response predicts human cancer progression: similarities between tumors and wounds","volume":"2","author":"HY Chang","year":"2004","journal-title":"PLoS biology"},{"issue":"9369","key":"pcbi.1009956.ref014","doi-asserted-by":"crossref","first-page":"1590","DOI":"10.1016\/S0140-6736(03)13308-9","article-title":"Gene expression predictors of breast cancer outcomes","volume":"361","author":"E Huang","year":"2003","journal-title":"The Lancet"},{"issue":"9460","key":"pcbi.1009956.ref015","doi-asserted-by":"crossref","first-page":"671","DOI":"10.1016\/S0140-6736(05)17947-1","article-title":"Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer","volume":"365","author":"Y Wang","year":"2005","journal-title":"The Lancet"},{"issue":"14","key":"pcbi.1009956.ref016","doi-asserted-by":"crossref","first-page":"2192","DOI":"10.1038\/sj.onc.1206288","article-title":"Expression profiles of non-small cell lung cancers on cDNA microarrays: identification of genes for prediction of lymph-node metastasis and sensitivity to anti-cancer drugs","volume":"22","author":"T 
Kikuchi","year":"2003","journal-title":"Oncogene"},{"issue":"3","key":"pcbi.1009956.ref017","doi-asserted-by":"crossref","first-page":"734","DOI":"10.1158\/1078-0432.CCR-15-0143","article-title":"A composite gene expression signature optimizes prediction of colorectal cancer metastasis and outcome","volume":"22","author":"MJ Schell","year":"2016","journal-title":"Clinical Cancer Research"},{"key":"pcbi.1009956.ref018","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1016\/j.urology.2016.01.012","article-title":"Decipher genomic classifier measured on prostate biopsy predicts metastasis risk","volume":"90","author":"EA Klein","year":"2016","journal-title":"Urology"},{"issue":"1","key":"pcbi.1009956.ref019","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1158\/1055-9965.EPI-14-0544-T","article-title":"MicroRNA classifier and nomogram for metastasis prediction in colon cancer","volume":"24","author":"IJ Goossens-Beumer","year":"2015","journal-title":"Cancer Epidemiology and Prevention Biomarkers"},{"issue":"13","key":"pcbi.1009956.ref020","doi-asserted-by":"crossref","first-page":"1858","DOI":"10.1093\/bioinformatics\/btu128","article-title":"A personalized committee classification approach to improving prediction of breast cancer metastasis","volume":"30","author":"MJ Jahid","year":"2014","journal-title":"Bioinformatics"},{"issue":"6","key":"pcbi.1009956.ref021","doi-asserted-by":"crossref","first-page":"560","DOI":"10.1056\/NEJMoa052933","article-title":"Concordance among gene-expression\u2013based predictors for breast cancer","volume":"355","author":"C Fan","year":"2006","journal-title":"New England Journal of Medicine"},{"key":"pcbi.1009956.ref022","doi-asserted-by":"crossref","unstructured":"Elkan C, Noto K, editors. Learning classifiers from only positive and unlabeled data. 
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining; 2008.","DOI":"10.1145\/1401890.1401920"},{"key":"pcbi.1009956.ref023","volume-title":"Diagnosis and Management of Metastatic Malignant Disease of Unknown Primary Origin","author":"National Collaborating Centre for C. National Institute for Health and Clinical Excellence: Guidance","year":"2010"},{"issue":"2","key":"pcbi.1009956.ref024","doi-asserted-by":"crossref","first-page":"554","DOI":"10.1111\/j.1541-0420.2008.01116.x","article-title":"Presence-only data and the EM algorithm","volume":"65","author":"G Ward","year":"2009","journal-title":"Biometrics"},{"issue":"1","key":"pcbi.1009956.ref025","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"R. Tibshirani","year":"1996","journal-title":"Journal of the Royal Statistical Society: Series B (Methodological)"},{"key":"pcbi.1009956.ref026","first-page":"1","article-title":"PULasso: High-dimensional variable selection with presence-only data","author":"H Song","year":"2019","journal-title":"Journal of the American Statistical Association"},{"issue":"5","key":"pcbi.1009956.ref027","doi-asserted-by":"crossref","first-page":"1932","DOI":"10.1109\/TCYB.2018.2816984","article-title":"AdaSampling for positive-unlabeled and label noise learning with bioinformatics applications","volume":"49","author":"P Yang","year":"2018","journal-title":"IEEE transactions on cybernetics"},{"key":"pcbi.1009956.ref028","first-page":"1100","article-title":"Cox\u2019s regression model for counting processes: a large sample study","author":"PK Andersen","year":"1982","journal-title":"The annals of statistics"},{"key":"pcbi.1009956.ref029","first-page":"65","article-title":"A simple sequentially rejective multiple test procedure","author":"S. 
Holm","year":"1979","journal-title":"Scandinavian journal of statistics"},{"issue":"D1","key":"pcbi.1009956.ref030","doi-asserted-by":"crossref","first-page":"D1049","DOI":"10.1093\/nar\/gku1179","article-title":"Gene ontology consortium: going forward","volume":"43","author":"GO Consortium","year":"2015","journal-title":"Nucleic acids research"},{"issue":"15","key":"pcbi.1009956.ref031","doi-asserted-by":"crossref","first-page":"2440","DOI":"10.4161\/cc.10.15.16870","article-title":"Hydrogen peroxide fuels aging, inflammation, cancer metabolism and metastasis: the seed and soil also needs \"fertilizer\"","volume":"10","author":"MP Lisanti","year":"2011","journal-title":"Cell Cycle"},{"issue":"2","key":"pcbi.1009956.ref032","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1007\/s10555-010-9225-4","article-title":"Metastasis: cancer cell\u2019s escape from oxidative stress","volume":"29","author":"G Pani","year":"2010","journal-title":"Cancer Metastasis Rev"},{"issue":"8","key":"pcbi.1009956.ref033","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1038\/nrc3105","article-title":"Calcium in tumour metastasis: new roles for known actors","volume":"11","author":"N Prevarskaya","year":"2011","journal-title":"Nature Reviews Cancer"},{"key":"pcbi.1009956.ref034","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1016\/bs.acr.2016.05.005","article-title":"Chapter Eight\u2014Cytokine Regulation of Metastasis and Tumorigenicity","volume":"132","author":"M Yao","year":"2016","journal-title":"Advances in Cancer Research"},{"issue":"7","key":"pcbi.1009956.ref035","doi-asserted-by":"crossref","first-page":"1611","DOI":"10.1016\/j.cell.2017.10.044","article-title":"Single-cell transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck cancer","volume":"171","author":"SV 
Puram","year":"2017","journal-title":"Cell"},{"issue":"1","key":"pcbi.1009956.ref036","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/ncomms15081","article-title":"Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer","volume":"8","author":"W Chung","year":"2017","journal-title":"Nature communications"},{"issue":"1","key":"pcbi.1009956.ref037","first-page":"1","article-title":"Single-cell RNA sequencing demonstrates the molecular and cellular reprogramming of metastatic lung adenocarcinoma","volume":"11","author":"N Kim","year":"2020","journal-title":"Nature communications"},{"key":"pcbi.1009956.ref038","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: A graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"PJ Rousseeuw","year":"1987","journal-title":"Journal of Computational and Applied Mathematics"},{"issue":"5","key":"pcbi.1009956.ref039","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1097\/01.SLA.0000064361.12265.9A","article-title":"Extracapsular extension of the sentinel lymph node metastasis: a predictor of nonsentinel node tumor burden","volume":"237","author":"KB Stitzenberg","year":"2003","journal-title":"Annals of surgery"},{"issue":"1","key":"pcbi.1009956.ref040","first-page":"187","article-title":"Sparse algorithms are not stable: A no-free-lunch theorem","volume":"34","author":"H Xu","year":"2011","journal-title":"IEEE transactions on pattern analysis and machine intelligence"},{"issue":"7781","key":"pcbi.1009956.ref041","doi-asserted-by":"crossref","first-page":"210","DOI":"10.1038\/s41586-019-1689-y","article-title":"Pan-cancer whole-genome analyses of metastatic solid tumours","volume":"575","author":"P 
Priestley","year":"2019","journal-title":"Nature"},{"issue":"1","key":"pcbi.1009956.ref042","doi-asserted-by":"crossref","first-page":"e1005911","DOI":"10.1371\/journal.pcbi.1005911","article-title":"Integration of pan-cancer transcriptomics with RPPA proteomics reveals mechanisms of epithelial-mesenchymal transition","volume":"14","author":"S Koplev","year":"2018","journal-title":"PLoS computational biology"},{"issue":"3","key":"pcbi.1009956.ref043","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1158\/1078-0432.CCR-15-0876","article-title":"A Patient-Derived, Pan-Cancer EMT Signature Identifies Global Molecular Alterations and Immune Target Enrichment Following Epithelial-to-Mesenchymal Transition","volume":"22","author":"MP Mak","year":"2016","journal-title":"Clinical Cancer Research"},{"key":"pcbi.1009956.ref044","doi-asserted-by":"crossref","unstructured":"Chen T, Guestrin C, editors. Xgboost: A scalable tree boosting system. Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining; 2016.","DOI":"10.1145\/2939672.2939785"},{"key":"pcbi.1009956.ref045","unstructured":"Ho TK, editor Random decision forests. 
Proceedings of 3rd international conference on document analysis and recognition; 1995: IEEE."},{"issue":"2","key":"pcbi.1009956.ref046","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1016\/j.cell.2018.03.022","article-title":"Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer","volume":"173","author":"KA Hoadley","year":"2018","journal-title":"Cell"},{"issue":"2","key":"pcbi.1009956.ref047","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1016\/j.cell.2018.02.052","article-title":"An integrated TCGA pan-cancer clinical data resource to drive high-quality survival outcome analytics","volume":"173","author":"J Liu","year":"2018","journal-title":"Cell"},{"issue":"7","key":"pcbi.1009956.ref048","doi-asserted-by":"crossref","first-page":"1888","DOI":"10.1016\/j.cell.2019.05.031","article-title":"Comprehensive integration of single-cell data","volume":"177","author":"T Stuart","year":"2019","journal-title":"Cell"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1009956","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,4,8]],"date-time":"2022-04-08T00:00:00Z","timestamp":1649376000000}}],"container-title":["PLOS Computational 
Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009956","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,21]],"date-time":"2024-09-21T06:19:29Z","timestamp":1726899569000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009956"}},"subtitle":[],"editor":[{"given":"Jie","family":"Liu","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,3,29]]},"references-count":48,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,3,29]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1009956","relation":{"new_version":[{"id-type":"doi","id":"10.1371\/journal.pcbi.1009956","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,29]]}}}