{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T14:12:46Z","timestamp":1740147166033,"version":"3.37.3"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2020,6,9]],"date-time":"2020-06-09T00:00:00Z","timestamp":1591660800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,6,9]],"date-time":"2020-06-09T00:00:00Z","timestamp":1591660800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["SFB1074","Project Z1"],"award-info":[{"award-number":["SFB1074","Project Z1"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["GRK 2254 HEIST"],"award-info":[{"award-number":["GRK 2254 HEIST"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["e:Med","conFirm"],"award-info":[{"award-number":["e:Med","conFirm"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["id 01ZX1708C"],"award-info":[{"award-number":["id 01ZX1708C"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Adv Data Anal Classif"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Data-driven algorithms stand and fall with the availability and quality of existing data sources. Both can be limited in high-dimensional settings (<jats:inline-formula><jats:alternatives><jats:tex-math>$$n \\gg m$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>n<\/mml:mi>\n                    <mml:mo>\u226b<\/mml:mo>\n                    <mml:mi>m<\/mml:mi>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>). For example, supervised learning algorithms designed for molecular pheno- or genotyping are restricted to samples of the corresponding diagnostic classes. Samples of other related entities, such as arise in differential diagnosis, are usually not utilized in this learning scheme. Nevertheless, they might provide domain knowledge on the background or context of the original diagnostic task. In this work, we discuss the possibility of incorporating samples of foreign classes in the training of diagnostic classification models that can be related to the task of differential diagnosis. Especially in heterogeneous data collections comprising multiple diagnostic categories, the foreign ones can change the magnitude of available samples. More precisely, we utilize this information for the internal feature selection process of diagnostic models. We propose the use of chained correlations of original and foreign diagnostic classes. This method allows the detection of intermediate foreign classes by evaluating the correlation between class labels and features for each pair of original and foreign categories. Interestingly, this criterion does not require direct comparisons of the initial diagnostic groups and therefore, might be suitable for settings with restricted data access.<\/jats:p>","DOI":"10.1007\/s11634-020-00397-5","type":"journal-article","created":{"date-parts":[[2020,6,9]],"date-time":"2020-06-09T06:23:03Z","timestamp":1591683783000},"page":"871-884","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Chained correlations for feature selection"],"prefix":"10.1007","volume":"14","author":[{"given":"Ludwig","family":"Lausser","sequence":"first","affiliation":[]},{"given":"Robin","family":"Szekely","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4759-5254","authenticated-orcid":false,"given":"Hans A.","family":"Kestler","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,6,9]]},"reference":[{"key":"397_CR1","volume-title":"Dynamic programming","author":"R Bellman","year":"1957","unstructured":"Bellman R (1957) Dynamic programming. Princeton University Press, Princeton"},{"issue":"40","key":"397_CR2","doi-asserted-by":"publisher","first-page":"15605","DOI":"10.1073\/pnas.0806883105","volume":"105","author":"NC Berchtold","year":"2008","unstructured":"Berchtold NC, Cribbs DH, Coleman PD, Rogers J, Head E, Kim R, Beach T, Miller C, Troncoso J, Trojanowski JQ, Zielke HR, Cotman CW (2008) Gene expression changes in the course of normal brain aging are sexually dimorphic. Proc Natl Acad Sci USA 105(40):15605\u201315610","journal-title":"Proc Natl Acad Sci USA"},{"key":"397_CR3","unstructured":"Bittner M (2005) Expression project for oncology (expO). National Center for Biotechnology Information"},{"issue":"1","key":"397_CR4","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Random forests. Mach Learn 45(1):5\u201332","journal-title":"Mach Learn"},{"key":"397_CR5","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-20192-9","volume-title":"Statistics for high-dimensional data","author":"P B\u00fchlmann","year":"2011","unstructured":"B\u00fchlmann P, van de Geer S (2011) Statistics for high-dimensional data. Springer Series in Statistics, Springer, Heidelberg"},{"key":"397_CR6","first-page":"285","volume-title":"Data analysis","author":"A Burkovski","year":"2014","unstructured":"Burkovski A, Lausser L, Kraus J, Kestler H (2014) Rank aggregation for candidate gene identification, machine learning and knowledge discovery. In: Spiliopoulou M, Schmidt-Thieme L, Janning R (eds) Data analysis. Springer International Publishing, Cham, pp 285\u2013293"},{"issue":"1","key":"397_CR7","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1023\/A:1007379606734","volume":"28","author":"R Caruana","year":"1997","unstructured":"Caruana R (1997) Multitask learning. Mach Learn 28(1):41\u201375","journal-title":"Mach Learn"},{"key":"397_CR8","volume-title":"Semi-supervised learning","author":"O Chapelle","year":"2010","unstructured":"Chapelle O, Sch\u00f6lkopf B, Zien A (2010) Semi-supervised learning, 1st edn. The MIT Press, Cambridge","edition":"1"},{"key":"397_CR9","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1007\/978-3-540-69507-3_4","volume-title":"SOFSEM 2007: theory and practice of computer science","author":"Y Chevaleyre","year":"2007","unstructured":"Chevaleyre Y, Endriss U, Lang J, Maudet N (2007) A short introduction to computational social choice. In: van Leeuwen J, Italiano G, van der Hoek W, Meinel C, Sack H, Pl\u00e1\u0161il F (eds) SOFSEM 2007: theory and practice of computer science. Springer, Berlin, Heidelberg, pp 51\u201369"},{"issue":"3","key":"397_CR10","doi-asserted-by":"publisher","first-page":"326","DOI":"10.1109\/PGEC.1965.264137","volume":"14","author":"TM Cover","year":"1965","unstructured":"Cover TM (1965) Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Trans Electron Comput 14(3):326\u2013334","journal-title":"IEEE Trans Electron Comput"},{"key":"397_CR11","volume-title":"Multi-objective optimization using evolutionary algorithms","author":"K Deb","year":"2001","unstructured":"Deb K (2001) Multi-objective optimization using evolutionary algorithms. Wiley, Hoboken"},{"key":"397_CR12","doi-asserted-by":"crossref","unstructured":"Fix E, Hodges JL (1951) Discriminatory analysis: nonparametric discrimination: consistency properties. In: Technical reports project 21-49-004, report number 4. USAF School of Aviation Medicine, Randolf Field, Texas","DOI":"10.1037\/e471672008-001"},{"issue":"7\u20139","key":"397_CR13","doi-asserted-by":"publisher","first-page":"1276","DOI":"10.1016\/j.neucom.2006.11.019","volume":"70","author":"D Fran\u00e7ois","year":"2007","unstructured":"Fran\u00e7ois D, Rossi F, Wertz V, Verleysen M (2007) Resampling methods for parameter-free and robust feature selection with mutual information. Neurocomputing 70(7\u20139):1276\u20131288","journal-title":"Neurocomputing"},{"issue":"7","key":"397_CR14","doi-asserted-by":"publisher","first-page":"2697","DOI":"10.1158\/0008-5472.CAN-10-3588","volume":"71","author":"RM Gobble","year":"2011","unstructured":"Gobble RM, Qin LX, Brill ER, Angeles CV, Ugras S, O\u2019Connor RB, Moraco NH, DeCarolis PL, Antonescu C, Singer S (2011) Expression profiling of liposarcoma yields a multigene predictor of patient outcome and identifies genes that contribute to liposarcomagenesis. Cancer Res 71(7):2697\u20132705","journal-title":"Cancer Res"},{"issue":"Mar","key":"397_CR15","first-page":"1157","volume":"3","author":"I Guyon","year":"2003","unstructured":"Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3(Mar):1157\u20131182","journal-title":"J Mach Learn Res"},{"issue":"15","key":"397_CR16","doi-asserted-by":"publisher","first-page":"2529","DOI":"10.1200\/JCO.2009.23.4732","volume":"28","author":"T Haferlach","year":"2010","unstructured":"Haferlach T, Kohlmann A, Wieczorek L, Basso G, Kronnie GT, B\u00e9n\u00e9 MC, Vos JD, Hern\u00e1ndez JM, Hofmann WK, Mills KI, Gilkes A, Chiaretti S, Shurtleff SA, Kipps TJ, Rassenti LZ, Yeoh AE, Papenhausen PR, Liu WM, Williams PM, Fo\u00e0 R (2010) Clinical utility of microarray-based gene expression profiling in the diagnosis and subclassification of leukemia: report from the international microarray innovations in leukemia study group. J Clin Oncol 28(15):2529\u20132537","journal-title":"J Clin Oncol"},{"key":"397_CR17","unstructured":"Hinneburg A, Aggarwal C, Keim D (2000) What is the nearest neighbor in high dimensional spaces? In: Proceedings of the 26th international conference on very large data bases, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 506\u2013515"},{"key":"397_CR18","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511921803","volume-title":"Evaluating learning algorithms: a classification perspective","author":"N Japkowicz","year":"2011","unstructured":"Japkowicz N, Shah M (2011) Evaluating learning algorithms: a classification perspective. Cambridge University Press, New York"},{"issue":"16","key":"397_CR19","doi-asserted-by":"publisher","first-page":"5730","DOI":"10.1158\/1078-0432.CCR-04-2225","volume":"11","author":"J Jones","year":"2005","unstructured":"Jones J, Otu H, Spentzos D, Kolia S, Inan M, Beecken WD, Fellbaum C, Gu X, Joseph M, Pantuck AJ, Jonas D, Libermann TA (2005) Gene signatures of progression and metastasis in renal cell cancer. Clin Cancer Res 11(16):5730\u20135739","journal-title":"Clin Cancer Res"},{"key":"397_CR20","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/3897.001.0001","volume-title":"An introduction to computational learning theory","author":"M Kearns","year":"1994","unstructured":"Kearns M, Vazirani U (1994) An introduction to computational learning theory. MIT Press, Cambridge"},{"issue":"2","key":"397_CR21","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1016\/j.alcohol.2007.03.003","volume":"41","author":"MW Kimpel","year":"2007","unstructured":"Kimpel MW, Strother WN, McClintick JN, Carr LG, Liang T, Edenberg HJ, McBride WJ (2007) Functional gene expression differences between inbred alcohol-preferring and non-preferring rats in five brain regions. Alcohol 41(2):95\u2013132","journal-title":"Alcohol"},{"issue":"3","key":"397_CR22","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1007\/s41060-018-0095-0","volume":"6","author":"J Kraus","year":"2018","unstructured":"Kraus J, Lausser L, Kuhn P, Jobst F, Bock M, Halanke C, Hummel M, Heuschmann P, Kestler HA (2018) Big data and precision medicine: challenges and strategies with healthcare data. Int J Data Sci Anal 6(3):241\u2013249","journal-title":"Int J Data Sci Anal"},{"key":"397_CR23","doi-asserted-by":"crossref","unstructured":"Lattke R, Lausser L, M\u00fcssel C, Kestler HA (2015) Detecting ordinal class structures. In: Schwenker F, Roli F, Kittler J (eds) Multiple classifier systems, MCS 2015. Lecture notes in computer science, vol 9132, pp 100\u2013111. Springer, Cham","DOI":"10.1007\/978-3-319-20248-8_9"},{"key":"397_CR24","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1016\/j.patrec.2013.03.027","volume":"37","author":"L Lausser","year":"2014","unstructured":"Lausser L, Schmid F, Schmid M, Kestler HA (2014) Unlabeling data can improve classification accuracy. Pattern Recogn Lett 37:15\u201323","journal-title":"Pattern Recogn Lett"},{"issue":"1","key":"397_CR25","first-page":"1","volume":"1","author":"L Lausser","year":"2016","unstructured":"Lausser L, Schmid F, Platzer M, Sillanp\u00e4\u00e4 MJ, Kestler HA (2016a) Semantic multi-classifier systems for the analysis of gene expression profiles. Arch Data Sci Ser A 1(1):1\u201319 (Online First)","journal-title":"Arch Data Sci Ser A"},{"key":"397_CR26","first-page":"1","volume":"12","author":"L Lausser","year":"2016","unstructured":"Lausser L, Schmid F, Schirra LR, Wilhelm A, Kestler H (2016b) Rank-based classifiers for extremely high-dimensional gene expression data. Adv Data Anal Classif 12:1\u201320","journal-title":"Adv Data Anal Classif"},{"key":"397_CR27","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1007\/978-3-319-99978-4_5","volume-title":"Artificial neural networks in pattern recognition","author":"L Lausser","year":"2018","unstructured":"Lausser L, Szekely R, Kessler V, Schwenker F, Kestler HA (2018a) Selecting features from foreign classes. In: Pancioni L, Schwenker F, Trentin E (eds) Artificial neural networks in pattern recognition. Springer International Publishing, Cham, pp 66\u201377"},{"issue":"2","key":"397_CR28","doi-asserted-by":"publisher","first-page":"863","DOI":"10.1007\/s11063-017-9706-3","volume":"48","author":"L Lausser","year":"2018","unstructured":"Lausser L, Szekely R, Schirra LR, Kestler HA (2018b) The influence of multi-class feature selection on the prediction of diagnostic phenotypes. Neural Process Lett 48(2):863\u2013880","journal-title":"Neural Process Lett"},{"issue":"5","key":"397_CR29","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v046.i05","volume":"46","author":"C M\u00fcssel","year":"2012","unstructured":"M\u00fcssel C, Lausser L, Maucher M, Kestler HA (2012) Multi-objective parameter selection for classifiers. J Stat Softw 46(5):1\u201327","journal-title":"J Stat Softw"},{"issue":"10","key":"397_CR30","doi-asserted-by":"publisher","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","volume":"22","author":"SJ Pan","year":"2010","unstructured":"Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345\u20131359","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"7","key":"397_CR31","doi-asserted-by":"publisher","first-page":"1878","DOI":"10.1158\/1535-7163.MCT-09-0016","volume":"8","author":"TD Pfister","year":"2009","unstructured":"Pfister TD, Reinhold WC, Agama K, Gupta S, Khin SA, Kinders RJ, Parchment RE, Tomaszewski JE, Doroshow JH, Pommier Y (2009) Topoisomerase I levels in the NCI-60 cancer cell line panel determined by validated ELISA and microarray analysis and correlation with indenoisoquinoline sensitivity. Mol Cancer Ther 8(7):1878\u20131884","journal-title":"Mol Cancer Ther"},{"issue":"17","key":"397_CR32","doi-asserted-by":"publisher","first-page":"7131","DOI":"10.1073\/pnas.0902232106","volume":"106","author":"M Sheffer","year":"2009","unstructured":"Sheffer M, Bacolod MD, Zuk O, Giardina SF, Pincas H, Barany F, Paty PB, Gerald WL, Notterman DA, Domany E (2009) Association of survival and disease progression with chromosomal instability: a genomic exploration of colorectal cancer. Proc Nat Acad Sci 106(17):7131\u20137136","journal-title":"Proc Nat Acad Sci"},{"key":"397_CR33","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1016\/j.ebiom.2016.08.037","volume":"12","author":"S Taudien","year":"2016","unstructured":"Taudien S, Lausser L, Giamarellos-Bourboulis EJ, Sponholz C, F S, Felder M, Schirra LR, Schmid F, Gogos C, S G, Petersen BS, Franke A, Lieb W, Huse K, Zipfel PF, Kurzai O, Moepps B, Gierschik P, Bauer M, Scherag A, Kestler HA, Platzer M (2016) Genetic factors of the disease course after sepsis: rare deleterious variants are predictive. EBioMedicine 12:227\u2013238","journal-title":"EBioMedicine"},{"key":"397_CR34","volume-title":"Statistical learning theory","author":"VN Vapnik","year":"1998","unstructured":"Vapnik VN (1998) Statistical learning theory. Wiley, New York"},{"issue":"1","key":"397_CR35","doi-asserted-by":"publisher","first-page":"99","DOI":"10.3390\/e21010099","volume":"21","author":"S Yu","year":"2019","unstructured":"Yu S, Pr\u00edncipe J (2019) Simple stopping criteria for information theoretic feature selection. Entropy 21(1):99","journal-title":"Entropy"}],"container-title":["Advances in Data Analysis and Classification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-020-00397-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11634-020-00397-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-020-00397-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,6,9]],"date-time":"2021-06-09T06:34:49Z","timestamp":1623220489000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11634-020-00397-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,9]]},"references-count":35,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["397"],"URL":"https:\/\/doi.org\/10.1007\/s11634-020-00397-5","relation":{},"ISSN":["1862-5347","1862-5355"],"issn-type":[{"type":"print","value":"1862-5347"},{"type":"electronic","value":"1862-5355"}],"subject":[],"published":{"date-parts":[[2020,6,9]]},"assertion":[{"value":"30 June 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 January 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 April 2020","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 June 2020","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}