{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T03:17:53Z","timestamp":1740107873296,"version":"3.37.3"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"13","license":[{"start":{"date-parts":[[2021,4,15]],"date-time":"2021-04-15T00:00:00Z","timestamp":1618444800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,4,15]],"date-time":"2021-04-15T00:00:00Z","timestamp":1618444800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100007195","name":"Universit\u00e0 degli Studi di Napoli Federico II","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100007195","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Soft Comput"],"published-print":{"date-parts":[[2021,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We present a numerical attribute dependency method for massive datasets based on the concepts of direct and inverse fuzzy transform. In a previous work, we used these concepts for numerical attribute dependency in data analysis; therein, the multi-dimensional inverse fuzzy transform was useful for approximating a regression function. Here we extend this method to massive datasets, to which the previous method could not be applied because of its high memory requirements. Our method is tested on a large dataset formed from 402,678 census sections of the Italian regions provided by the Italian National Statistical Institute (ISTAT) in 2011. 
Comparative tests against two well-known regression methods, support vector regression and the multilayer perceptron, show that the proposed algorithm performs comparably to both. Moreover, our method requires fewer parameters than either of these two algorithms.<\/jats:p>","DOI":"10.1007\/s00500-021-05760-y","type":"journal-article","created":{"date-parts":[[2021,4,15]],"date-time":"2021-04-15T13:59:27Z","timestamp":1618495167000},"page":"8731-8746","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Attribute dependency data analysis for massive datasets by fuzzy transforms"],"prefix":"10.1007","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5690-5384","authenticated-orcid":false,"given":"Ferdinando","family":"Di Martino","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Salvatore","family":"Sessa","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,4,15]]},"reference":[{"key":"5760_CR1","doi-asserted-by":"publisher","unstructured":"Anguita A, Ridella S, Rivieccio F (2005) K-fold generalization capability assessment for support vector classifiers. In:\u00a0Proceedings of the IEEE international joint conference on neural networks, IJCNN 2005, pp. 855\u2013858. https:\/\/doi.org\/10.1109\/IJCNN.2005.1555964","DOI":"10.1109\/IJCNN.2005.1555964"},{"key":"5760_CR2","doi-asserted-by":"publisher","first-page":"314","DOI":"10.1016\/j.ins.2014.01.015","volume":"275","author":"CLP Chen","year":"2014","unstructured":"Chen CLP, Zhang CY (2014) Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inf Sci 275:314\u2013347. 
https:\/\/doi.org\/10.1016\/j.ins.2014.01.015","journal-title":"Inf Sci"},{"key":"5760_CR3","doi-asserted-by":"publisher","unstructured":"Chen C, Li K, Duan M, Li K (2017) Chapter 6\u2014extreme learning machine and its applications in big data processing. In: Big data analytics for sensor-network collected intelligence, intelligent data-centric systems, pp. 117\u2013150. https:\/\/doi.org\/10.1016\/B9780128093931.000064.","DOI":"10.1016\/B9780128093931.000064"},{"issue":"4","key":"5760_CR4","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1109\/TKDE.2009.116","volume":"22","author":"CH Cheng","year":"2010","unstructured":"Cheng CH, Tan P, Jin R (2010) Efficient algorithm for localized support vector machine. IEEE Trans Knowl Data Eng 22(4):537\u2013549. https:\/\/doi.org\/10.1109\/TKDE.2009.116","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"5760_CR5","doi-asserted-by":"publisher","unstructured":"Collobert R, Bengio S (2004) Links between perceptrons, MLPs and SVMs. In: ICML '04: proceedings of the 21st international conference on machine learning. https:\/\/doi.org\/10.1145\/1015330.1015415","DOI":"10.1145\/1015330.1015415"},{"key":"5760_CR6","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1007\/BF02551274","volume":"2","author":"G Cybenko","year":"1989","unstructured":"Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signal Syst 2:303\u2013314. https:\/\/doi.org\/10.1007\/BF02551274","journal-title":"Math Control Signal Syst"},{"key":"5760_CR7","doi-asserted-by":"crossref","unstructured":"Dean J (2014) Big Data, data mining, and machine learning: value creation for business leaders and practitioners. Wiley & Sons Inc., New York. 
ISBN:15024629159781502462916","DOI":"10.1002\/9781118691786"},{"key":"5760_CR8","doi-asserted-by":"publisher","first-page":"2349","DOI":"10.1016\/j.ins.2006.12.027","volume":"177","author":"F Di Martino","year":"2007","unstructured":"Di Martino F, Sessa S (2007) Compression and decompression of image with discrete fuzzy transforms. Inf Sci 177:2349\u20132362. https:\/\/doi.org\/10.1016\/j.ins.2006.12.027","journal-title":"Inf Sci"},{"key":"5760_CR9","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1016\/j.ins.2012.01.014","volume":"195","author":"F Di Martino","year":"2012","unstructured":"Di Martino F, Sessa S (2012) Fragile watermarking tamper detection with images compressed by fuzzy transform. Inf Sci 195:62\u201390. https:\/\/doi.org\/10.1016\/j.ins.2012.01.014","journal-title":"Inf Sci"},{"key":"5760_CR10","doi-asserted-by":"publisher","first-page":"110","DOI":"10.1016\/j.ijar.2007.06.008","volume":"48","author":"F Di Martino","year":"2008","unstructured":"Di Martino F, Loia V, Perfilieva I, Sessa S (2008) An image coding\/decoding method based on direct and inverse fuzzy transforms. Int J Approx Reason 48:110\u2013131. https:\/\/doi.org\/10.1016\/j.ijar.2007.06.008","journal-title":"Int J Approx Reason"},{"key":"5760_CR11","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1016\/j.ins.2009.10.012","volume":"180","author":"F Di Martino","year":"2010","unstructured":"Di Martino F, Loia V, Sessa S (2010a) Fuzzy transforms method and attribute dependency in data analysis. Inf Sci 180:493\u2013505. https:\/\/doi.org\/10.1016\/j.ins.2009.10.012","journal-title":"Inf Sci"},{"key":"5760_CR12","doi-asserted-by":"publisher","first-page":"3914","DOI":"10.1016\/j.ins.2010.06.030","volume":"180","author":"F Di Martino","year":"2010","unstructured":"Di Martino F, Loia V, Sessa S (2010b) Fuzzy transforms for compression and decompression of color videos. Inf Sci 180:3914\u20133931. 
https:\/\/doi.org\/10.1016\/j.ins.2010.06.030","journal-title":"Inf Sci"},{"key":"5760_CR13","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1016\/j.fss.2010.11.009","volume":"180","author":"F Di Martino","year":"2011","unstructured":"Di Martino F, Loia V, Sessa S (2011a) Fuzzy transforms method in prediction data analysis. Fuzzy Sets Syst 180:146\u2013163. https:\/\/doi.org\/10.1016\/j.fss.2010.11.009","journal-title":"Fuzzy Sets Syst"},{"key":"5760_CR14","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1016\/j.fss.2009.08.002","volume":"161","author":"F Di Martino","year":"2011","unstructured":"Di Martino F, Loia V, Sessa S (2011b) A segmentation method for images compressed by fuzzy transforms. Fuzzy Sets Syst 161:56\u201374. https:\/\/doi.org\/10.1016\/j.fss.2009.08.002","journal-title":"Fuzzy Sets Syst"},{"key":"5760_CR15","unstructured":"Draper NR, Smith H (1988) Applied regression analysis. Wiley & Sons Inc., New York. ISBN: 9780471170822"},{"key":"5760_CR16","unstructured":"Drucker H, Burges CJC, Kaufman L, Smola AJ, Vapnik V (1996) Support vector regression machines. In: NIPS'96 proceedings of the 9th international conference on neural information processing systems 1996, pp. 155\u2013161. MIT Press"},{"issue":"1","key":"5760_CR17","doi-asserted-by":"publisher","first-page":"145","DOI":"10.4137\/CIN.S13875","volume":"13","author":"H Han","year":"2019","unstructured":"Han H, Jian X (2019) Overcome support vector machine diagnosis overfitting. Cancer Inform 13(1):145\u2013158. https:\/\/doi.org\/10.4137\/CIN.S13875","journal-title":"Cancer Inform"},{"key":"5760_CR18","unstructured":"Han M, Kamber M, Pei J (2012) Data mining: concepts and techniques, 3rd ed. Morgan Kaufmann (Elsevier). 
ISBN: 9780123814791"},{"key":"5760_CR19","doi-asserted-by":"publisher","DOI":"10.1007\/9780387848587","volume-title":"The elements of statistical learning: data mining, inference, and prediction","author":"T Hastie","year":"2009","unstructured":"Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer, New York. https:\/\/doi.org\/10.1007\/9780387848587"},{"key":"5760_CR20","unstructured":"Haykin S (1999) Neural networks: a comprehensive foundation, 2nd ed. Prentice Hall. ISBN: 0132733501"},{"key":"5760_CR21","unstructured":"Haykin S (2009) Neural networks and learning machines, 3rd ed. Prentice Hall. ISBN: 100131471392"},{"key":"5760_CR22","unstructured":"Johnson RA, Wichern DW (1992) Applied multivariate statistical analysis. Prentice-Hall International, London. ISBN: 9780131877153"},{"issue":"5","key":"5760_CR23","doi-asserted-by":"publisher","first-page":"21","DOI":"10.14257\/ijseia.2015.9.5.03","volume":"9","author":"S Jun","year":"2015","unstructured":"Jun S, Lee SJ, Ryu JB (2015) A divided regression analysis for big data. Int J Softw Eng Appl 9(5):21\u201332. https:\/\/doi.org\/10.14257\/ijseia.2015.9.5.03","journal-title":"Int J Softw Eng Appl"},{"key":"5760_CR24","doi-asserted-by":"crossref","unstructured":"Lee YS, Yen SJ (2004) Classification based on attribute dependency. In: Proceedings of 6th international conference DaWaK\u2019 04. Lecture Notes in Computer Sciences, 5192:259\u2013268. ISBN: 9783540876045","DOI":"10.1007\/978-3-540-30076-2_26"},{"key":"5760_CR25","doi-asserted-by":"crossref","unstructured":"Leskovec J, Rajaraman A, Ullmann JD (2014) Mining of massive datasets. Cambridge University Press, 2nd ed. 
ISBN: 9781107077232","DOI":"10.1017\/CBO9781139924801"},{"issue":"1","key":"5760_CR26","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1109\/72.977258","volume":"13","author":"S Mitra","year":"2002","unstructured":"Mitra S, Pal SK, Mitra P (2002) Data mining in soft computing framework: a survey. IEEE Trans Neural Netw 13(1):3\u201314. https:\/\/doi.org\/10.1109\/72.977258","journal-title":"IEEE Trans Neural Netw"},{"issue":"5\u20136","key":"5760_CR27","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1016\/0925-2312(91)900235","volume":"2","author":"F Murtagh","year":"1991","unstructured":"Murtagh F (1991) Multilayer perceptrons for classification and regression. Neurocomputing 2(5\u20136):183\u2013197. https:\/\/doi.org\/10.1016\/0925-2312(91)900235","journal-title":"Neurocomputing"},{"key":"5760_CR28","doi-asserted-by":"publisher","unstructured":"Peng H, Choi D, Liang C (2013) Evaluating parallel logistic regression models. In: 2013 IEEE international conference on big data, Silicon Valley, CA, USA, 6\u20139\/10\/2013. https:\/\/doi.org\/10.1109\/BigData.2013.6691743","DOI":"10.1109\/BigData.2013.6691743"},{"key":"5760_CR29","doi-asserted-by":"publisher","first-page":"993","DOI":"10.1016\/j.fss.2005.11.012","volume":"157","author":"I Perfilieva","year":"2006","unstructured":"Perfilieva I (2006) Fuzzy transforms: theory and applications. Fuzzy Sets Syst 157:993\u20131023. https:\/\/doi.org\/10.1016\/j.fss.2005.11.012","journal-title":"Fuzzy Sets Syst"},{"key":"5760_CR30","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1016\/j.ijar.2007.06.003","volume":"48","author":"I Perfilieva","year":"2008","unstructured":"Perfilieva I, Nov\u00e0k V, Dvor\u00e0k A (2008) Fuzzy transforms in the analysis of data. Int J Approx Reason 48:36\u201346. https:\/\/doi.org\/10.1016\/j.ijar.2007.06.003","journal-title":"Int J Approx Reason"},{"key":"5760_CR31","unstructured":"Piatecky-Shapiro G, Frawley WJ (1991) Knowledge discovery in databases. 
Cambridge (MA), MIT Press. ISBN: 9780262660709"},{"issue":"20","key":"5760_CR32","first-page":"331","volume":"118","author":"KS Raju","year":"2018","unstructured":"Raju KS, Murti MR, Rao MV, Satapathy SC (2018) Support vector machine with K-fold cross validation model for software fault prediction. Int J Pure Appl Math 118(20):331\u2013334","journal-title":"Int J Pure Appl Math"},{"key":"5760_CR33","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1016\/j.neunet.2014.09.003","volume":"61","author":"J Schmidhube","year":"2014","unstructured":"Schmidhube J (2014) Deep learning in neural networks: an overview. Neural Netw 61:85\u2013117. https:\/\/doi.org\/10.1016\/j.neunet.2014.09.003","journal-title":"Neural Netw"},{"key":"5760_CR34","doi-asserted-by":"publisher","unstructured":"Segata N, Blanzieri E (2009) Fast local support vector machines for large datasets. In: Perner P (ed) Machine learning and data mining in pattern recognition. MLDM 2009. Lecture notes in computer science, vol 5632. Springer, Berlin, pp 295\u2013310. https:\/\/doi.org\/10.1007\/978-3-642-03070-3_22","DOI":"10.1007\/978-3-642-03070-3_22"},{"issue":"4","key":"5760_CR35","first-page":"135","volume":"3","author":"S Singh","year":"2015","unstructured":"Singh S, Firdaus T, Sharma AK (2015) Survey on big data using data mining. Int J Eng Dev Res 3(4):135\u2013143","journal-title":"Int J Eng Dev Res"},{"key":"5760_CR36","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1016\/01650114(87)900339","volume":"24","author":"H Tanaka","year":"1987","unstructured":"Tanaka H (1987) Fuzzy data analysis by possibilistic linear models. Fuzzy Sets Syst 24:363\u2013375. 
https:\/\/doi.org\/10.1016\/01650114(87)900339","journal-title":"Fuzzy Sets Syst"},{"issue":"2","key":"5760_CR37","doi-asserted-by":"publisher","first-page":"437","DOI":"10.1007\/s1106301493665","volume":"42","author":"P Thomas","year":"2015","unstructured":"Thomas P, Suhner MC (2015) A new multilayer perceptron pruning algorithm for classification and regression application. Neural Process Lett 42(2):437\u2013458. https:\/\/doi.org\/10.1007\/s1106301493665","journal-title":"Neural Process Lett"},{"issue":"7","key":"5760_CR38","doi-asserted-by":"publisher","first-page":"2738","DOI":"10.1016\/j.eswa.2012.11.019","volume":"40","author":"M Vucetic","year":"2013","unstructured":"Vucetic M, Hudec M, Vujo\u0161evi\u0107 M (2013) A new method for computing fuzzy functional dependencies in relational database systems. Expert Syst Appl 40(7):2738\u20132745. https:\/\/doi.org\/10.1016\/j.eswa.2012.11.019","journal-title":"Expert Syst Appl"},{"issue":"1","key":"5760_CR39","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1111\/rssc.12068","volume":"64","author":"SN Wood","year":"2015","unstructured":"Wood SN, Goude Y, Shaw S (2015) Generalized additive models for large data sets. J R Stat Soc Ser C (Appl Stat) 64(1):139\u2013155. https:\/\/doi.org\/10.1111\/rssc.12068","journal-title":"J R Stat Soc Ser C (Appl Stat)"},{"issue":"1","key":"5760_CR40","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1109\/TKDE.2013.109","volume":"26","author":"X Wu","year":"2014","unstructured":"Wu X, Zhu X, Wu GQ, Ding W (2014) Data mining with Big Data. IEEE Trans Knowl Data Eng 26(1):97\u2013107. 
https:\/\/doi.org\/10.1109\/TKDE.2013.109","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"5760_CR41","doi-asserted-by":"publisher","first-page":"450","DOI":"10.1016\/j.engappai.2019.03.011","volume":"81","author":"L Yao","year":"2019","unstructured":"Yao L, Ge Z (2019) Distributed parallel deep learning of hierarchical extreme learning Machine for multimode quality prediction with big process data. Eng Appl Artif Intell 81:450\u2013465. https:\/\/doi.org\/10.1016\/j.engappai.2019.03.011","journal-title":"Eng Appl Artif Intell"},{"issue":"10","key":"5760_CR42","doi-asserted-by":"publisher","first-page":"12328","DOI":"10.1016\/j.eswa.2011.04.011","volume":"38","author":"SJ Yen","year":"2011","unstructured":"Yen SJ, Lee YS (2011) A neural network approach to discover attribute dependency for improving the performance of classification. Expert Syst Appl 38(10):12328\u201312338. https:\/\/doi.org\/10.1016\/j.eswa.2011.04.011","journal-title":"Expert Syst Appl"},{"issue":"5","key":"5760_CR43","doi-asserted-by":"publisher","first-page":"1023","DOI":"10.1007\/s0052101107931","volume":"22","author":"J Zheng","year":"2013","unstructured":"Zheng J, Shen F, Fan H, Zhao J (2013) An online incremental learning support vector machine for large-scale data. Neural Compu Appl 22(5):1023\u20131035. 
https:\/\/doi.org\/10.1007\/s0052101107931","journal-title":"Neural Compu Appl"}],"container-title":["Soft Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00500-021-05760-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00500-021-05760-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00500-021-05760-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,6,14]],"date-time":"2021-06-14T13:14:00Z","timestamp":1623676440000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00500-021-05760-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,15]]},"references-count":43,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2021,7]]}},"alternative-id":["5760"],"URL":"https:\/\/doi.org\/10.1007\/s00500-021-05760-y","relation":{},"ISSN":["1432-7643","1433-7479"],"issn-type":[{"type":"print","value":"1432-7643"},{"type":"electronic","value":"1433-7479"}],"subject":[],"published":{"date-parts":[[2021,4,15]]},"assertion":[{"value":"19 March 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 April 2021","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"This research does not contain any studies involving human participants performed by any of the 
authors.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}},{"value":"Informed consent was obtained from all individual participants included in the study.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Informed consent"}}]}}