{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,1]],"date-time":"2025-08-01T04:03:39Z","timestamp":1754021019846},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2022,9,2]],"date-time":"2022-09-02T00:00:00Z","timestamp":1662076800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,2]],"date-time":"2022-09-02T00:00:00Z","timestamp":1662076800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Helmut-Schmidt-Universit\u00e4t Universit\u00e4t der Bundeswehr Hamburg"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Adv Data Anal Classif"],"published-print":{"date-parts":[[2023,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We consider nonparametric prediction with multiple covariates, in particular categorical or functional predictors, or a mixture of both. The method proposed bases on an extension of the Nadaraya-Watson estimator where a kernel function is applied on a linear combination of distance measures each calculated on single covariates, with weights being estimated from the training data. The dependent variable can be categorical (binary or multi-class) or continuous, thus we consider both classification and regression problems. The methodology presented is illustrated and evaluated on artificial and real world data. 
Particularly it is observed that prediction accuracy can be increased, and irrelevant, noise variables can be identified\/removed by \u2018downgrading\u2019 the corresponding distance measures in a completely data-driven way.<\/jats:p>","DOI":"10.1007\/s11634-022-00513-7","type":"journal-article","created":{"date-parts":[[2022,9,2]],"date-time":"2022-09-02T07:02:30Z","timestamp":1662102150000},"page":"519-543","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Nonparametric regression and classification with functional, categorical, and mixed covariates"],"prefix":"10.1007","volume":"17","author":[{"given":"Leonie","family":"Selk","sequence":"first","affiliation":[]},{"given":"Jan","family":"Gertheiss","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,9,2]]},"reference":[{"key":"513_CR1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jmva.2021.104861","volume":"188","author":"G Aneiros","year":"2022","unstructured":"Aneiros G, Novo S, Vieu P (2022) Variable selection in functional regression models: A review. J of Multivariate Anal 188:104861","journal-title":"J of Multivariate Anal"},{"issue":"1","key":"513_CR2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.0030002","volume":"3","author":"SE Baranzini","year":"2004","unstructured":"Baranzini SE, Mousavi P, Rio J, Caillier SJ, Stillman A, Villoslada P, Wyatt MM, Comabella M, Greller LD, Somogyi R, Montalban X, Oksenberg JR (2004) Transcription-based prediction of response to IFN$$\\beta $$ using supervised computational methods. PLoS Biol 3(1):e2","journal-title":"PLoS Biol"},{"issue":"1","key":"513_CR3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1175\/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2","volume":"78","author":"GW Brier","year":"1950","unstructured":"Brier GW (1950) Verification of forecasts expressed in terms of probability. 
Monthly Weather Rev 78(1):1\u20133","journal-title":"Monthly Weather Rev"},{"key":"513_CR4","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez-Fontelo A, Henninger F, Kieslich PJ, Kreuter F, Greven S (2021) Predicting question difficulty in web surveys: A machine learning approach based on mouse movement features. Social Science Computer Review pp 1\u201322","DOI":"10.1177\/08944393211032950"},{"key":"513_CR5","volume-title":"Nonparametric Functional Data Analysis","author":"F Ferraty","year":"2006","unstructured":"Ferraty F, Vieu P (2006) Nonparametric Functional Data Analysis. Springer Series in Statistics, Springer, New York"},{"key":"513_CR6","doi-asserted-by":"publisher","first-page":"186","DOI":"10.1016\/j.chemolab.2015.04.019","volume":"146","author":"K Fuchs","year":"2015","unstructured":"Fuchs K, Gertheiss J, Tutz G (2015) Nearest neighbor ensembles for functional data with interpretable feature selection. Chemometrics and Intell Laboratory Syst 146:186\u2013197","journal-title":"Chemometrics and Intell Laboratory Syst"},{"issue":"477","key":"513_CR7","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1198\/016214506000001437","volume":"102","author":"T Gneiting","year":"2007","unstructured":"Gneiting T, Raftery AE (2007) Strictly proper scoring rules, prediction, and estimation. J of the Am Statistical Assoc 102(477):359\u2013378","journal-title":"J of the Am Statistical Assoc"},{"key":"513_CR8","unstructured":"Goldsmith J, Scheipl F, Huang L, Wrobel J, Di C, Gellar J, Harezlak J, McLean MW, Swihart B, Xiao L, Crainiceanu C, Reiss PT (2021) refund: Regression with Functional Data. 
https:\/\/CRAN.R-project.org\/package=refund, r package version 0.1-24"},{"key":"513_CR9","doi-asserted-by":"publisher","first-page":"2305","DOI":"10.1016\/j.eswa.2014.11.007","volume":"42","author":"T G\u00f3recki","year":"2015","unstructured":"G\u00f3recki T, \u0141uczak M (2015) Multivariate time series classification with parametric derivative dynamic time warping. Expert Syst with Appl 42:2305\u20132312","journal-title":"Expert Syst with Appl"},{"key":"513_CR10","unstructured":"G\u00f3recki T, Smaga \u0141 (2017) mfds: Multivariate Functional Data Sets. Adam Mickiewicz University, Poznan, https:\/\/github.com\/Halmaris\/mfds, r package version 0.1.0"},{"key":"513_CR11","doi-asserted-by":"publisher","first-page":"827","DOI":"10.1007\/s11634-015-0227-5","volume":"12","author":"A Gul","year":"2018","unstructured":"Gul A, Perperoglou A, Khan Z, Mahmoud O, Miftahuddin M, Adler W, Lausen B (2018) Ensemble of a subset of kNN classifiers. Adv in Data Anal and Classif 12:827\u2013840","journal-title":"Adv in Data Anal and Classif"},{"issue":"4","key":"513_CR12","doi-asserted-by":"publisher","first-page":"784","DOI":"10.1162\/rest.89.4.784","volume":"89","author":"P Hall","year":"2007","unstructured":"Hall P, Li Q, Racine JS (2007) Nonparametric estimation of regression functions in the presence of irrelevant regressors. The Rev of Econ and Statistics 89(4):784\u2013789","journal-title":"The Rev of Econ and Statistics"},{"key":"513_CR13","doi-asserted-by":"crossref","unstructured":"H\u00e4rdle W, M\u00fcller M (2000) Multivariate and semiparametric kernel regression. In: Schimek MG (ed) Smoothing and Regression: Approaches, Computation, and Application. 
Wiley Series in Probability and Statistics, Wiley, New York (chap\u00a012)","DOI":"10.1002\/9781118150658.ch12"},{"key":"513_CR14","volume-title":"The Elements of Statistical Learning-Data Mining, Inference, and Prediction","author":"T Hastie","year":"2009","unstructured":"Hastie T, Tibshirani R, Friedman J (2009) The Elements of Statistical Learning-Data Mining, Inference, and Prediction, 2nd edn. Springer Series in Statistics, Springer, New York","edition":"2"},{"key":"513_CR15","first-page":"258","volume":"18","author":"O Hirose","year":"2007","unstructured":"Hirose O, Yoshida R, Yamaguchi R, Imoto S, Higuchi T, Miyano S (2007) Clustering samples characterized by time course gene expression profiles using the mixture of state space models. Genome Inf 18:258\u2013266","journal-title":"Genome Inf"},{"issue":"2","key":"513_CR16","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/biostatistics\/kxv037","volume":"17","author":"M Kayano","year":"2016","unstructured":"Kayano M, Matsui H, Yamaguchi R, Imoto S, Miyano S (2016) Gene set differential analysis of time course expression profiles via sparse estimation in functional logistic model with application to time-dependent biomarker detection. Biostat 17(2):235\u2013248","journal-title":"Biostat"},{"key":"513_CR17","doi-asserted-by":"crossref","unstructured":"Kokoszka P, Reimherr M (2017) Introduction to Functional Data Analysis. Texts in Statistical Science. CRC Press, New York","DOI":"10.1201\/9781315117416"},{"key":"513_CR18","doi-asserted-by":"crossref","unstructured":"Koolagudi SG, Rastogi D, Rao KS (2012) Identification of language using mel-frequency cepstral coefficients (mfcc). In: Rajesh R, Ganesh K, Koh SCL (eds) Procedia Engineering 38: International Conference on Modelling, Optimisation and Computing (ICMOC). 
Elsevier, Amsterdam, pp 3391\u20133398","DOI":"10.1016\/j.proeng.2012.06.392"},{"issue":"3","key":"513_CR19","first-page":"433","volume":"18","author":"M Krzy\u015bko","year":"2017","unstructured":"Krzy\u015bko M, Smaga \u0141 (2017) An application of functional multivariate regression model to multiclass classification. Statistics in Trans New Ser 18(3):433\u2013442","journal-title":"Statistics in Trans New Ser"},{"key":"513_CR20","unstructured":"Liaw A, Wiener M (2002) Classification and regression by randomforest. R News 2(3):18\u201322. https:\/\/CRAN.R-project.org\/doc\/Rnews\/"},{"key":"513_CR21","doi-asserted-by":"publisher","first-page":"773","DOI":"10.1007\/s11634-018-0343-0","volume":"13","author":"AM Mbina","year":"2019","unstructured":"Mbina AM, Nkiet GM, Obiang FE (2019) Variable selection in discriminant analysis for mixed continuous-binary variables and several groups. Adv in Data Anal and Classif 13:773\u2013795","journal-title":"Adv in Data Anal and Classif"},{"key":"513_CR22","unstructured":"M\u00f6ller A, Gertheiss J (2018) A classification tree for functional data. In: Proceedings of the 33th International Workshop on Statistical Modelling. Statistical Modelling Society, pp 219\u2013224"},{"key":"513_CR23","doi-asserted-by":"publisher","first-page":"186","DOI":"10.1137\/1110024","volume":"10","author":"EA Nadaraya","year":"1964","unstructured":"Nadaraya EA (1964) On non-parametric estimates of density functions and regression curves. Theory of Probab and its Appl 10:186\u2013190","journal-title":"Theory of Probab and its Appl"},{"key":"513_CR24","unstructured":"R Core Team (2020) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. 
https:\/\/www.R-project.org\/"},{"key":"513_CR25","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1016\/S0304-4076(03)00157-X","volume":"119","author":"JS Racine","year":"2004","unstructured":"Racine JS, Li Q (2004) Nonparametric estimation of regression functions with both categorical and continuous data. J of Econom 119:99\u2013130","journal-title":"J of Econom"},{"key":"513_CR26","first-page":"1","volume":"25","author":"JS Racine","year":"2006","unstructured":"Racine JS, Hart JD, Li Q (2006) Testing the significance of categorical predictor variables in nonparametric regression models. Econom Theory 25:1\u201342","journal-title":"Econom Theory"},{"key":"513_CR27","doi-asserted-by":"publisher","DOI":"10.1007\/b98888","volume-title":"Functional Data Analysis","author":"J Ramsay","year":"2005","unstructured":"Ramsay J, Silverman B (2005) Functional Data Analysis. Springer Series in Statistics, Springer, New York"},{"key":"513_CR28","unstructured":"Revelle W (2021) psychTools: Tools to Accompany the \u2018psych\u2019 Package for Psychological Research. Northwestern University, Evanston, Illinois, https:\/\/CRAN.R-project.org\/package=psychTools, r package version 2.1.6"},{"key":"513_CR29","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1023\/A:1009957816843","volume":"1","author":"R Selten","year":"1998","unstructured":"Selten R (1998) Axiomatic characterization of the quadratic scoring rule. Exp Econom 1:43\u201362","journal-title":"Exp Econom"},{"issue":"3","key":"513_CR30","doi-asserted-by":"publisher","first-page":"599","DOI":"10.1080\/10485252.2014.916806","volume":"26","author":"HL Shang","year":"2014","unstructured":"Shang HL (2014) Bayesian bandwidth estimation for a functional nonparametric regression model with mixed types of regressors and unknown error density. 
J of Nonparametric Statistics 26(3):599\u2013615","journal-title":"J of Nonparametric Statistics"},{"key":"513_CR31","unstructured":"Vahle NM, Tomasik MJ (2021) Declines in memory and physical functioning when young adults experience being old in virtual reality. Preprint, Repository: OSF https:\/\/osf.io\/h53rk\/"},{"key":"513_CR32","doi-asserted-by":"crossref","unstructured":"Venables WN, Ripley BD (2002) Modern Applied Statistics with S, 4th edn. Springer, New York. http:\/\/www.stats.ox.ac.uk\/pub\/MASS4\/, ISBN 0-387-95457-0","DOI":"10.1007\/978-0-387-21706-2_14"},{"key":"513_CR33","doi-asserted-by":"publisher","DOI":"10.1098\/rsos.211594","volume":"9","author":"F Vogel","year":"2022","unstructured":"Vogel F, Vahle NM, Gertheiss J, Tomasik MJ (2022) Supervised learning for analysing movement patterns in a virtual reality experiment. Royal Soc Open Sci 9:211594","journal-title":"Royal Soc Open Sci"},{"key":"513_CR34","first-page":"359","volume":"26","author":"GS Watson","year":"1964","unstructured":"Watson GS (1964) Smooth regression analysis. Sankhya Ser A 26:359\u2013372","journal-title":"Sankhya Ser A"},{"issue":"470","key":"513_CR35","doi-asserted-by":"publisher","first-page":"577","DOI":"10.1198\/016214504000001745","volume":"100","author":"F Yao","year":"2005","unstructured":"Yao F, M\u00fcller HG, Wang JL (2005) Functional data analysis for sparse longitudinal data. 
J of the Am Statistical Assoc 100(470):577\u2013590","journal-title":"J of the Am Statistical Assoc"}],"container-title":["Advances in Data Analysis and Classification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-022-00513-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11634-022-00513-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-022-00513-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,26]],"date-time":"2023-11-26T14:07:55Z","timestamp":1701007675000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11634-022-00513-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,2]]},"references-count":35,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,6]]}},"alternative-id":["513"],"URL":"https:\/\/doi.org\/10.1007\/s11634-022-00513-7","relation":{},"ISSN":["1862-5347","1862-5355"],"issn-type":[{"value":"1862-5347","type":"print"},{"value":"1862-5355","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,2]]},"assertion":[{"value":"30 November 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 June 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 August 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 September 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"There are no relevant financial or non-financial competing interests to report.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Statements and Declarations"}}]}}