{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T02:42:46Z","timestamp":1768012966927,"version":"3.49.0"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,9,25]],"date-time":"2021-09-25T00:00:00Z","timestamp":1632528000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,9,25]],"date-time":"2021-09-25T00:00:00Z","timestamp":1632528000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2022,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>An object-based cloud storage system is a storage platform where big data is managed through the internet and data is considered as an object. A smart storage system should be able to handle the big data variety property by recommending the storage space for each data type automatically. Machine learning can help make a storage system automatic. This article proposes a classification engine framework for this purpose by utilizing a machine learning strategy. A feature selection approach wrapped with a classifier is proposed to automatically predict the proper storage space for the incoming big data. It helps build an automatic storage space recommendation system for an object-based cloud storage platform. 
To find out a suitable combination of feature selection algorithms and classifiers for the proposed classification engine, a comparative study of different supervised feature selection algorithms (i.e., Fisher score, F-score, Lll21) from three categories (similarity, statistical, sparse learning) associated with various classifiers (i.e., SVM, <jats:italic>K<\/jats:italic>-NN, Neural Network) is performed. We illustrate our study using RSoS system as it provides a cloud storage platform for the healthcare data as experimental big data by considering its variety property. The experiments confirm that Lll21 feature selection combined with <jats:italic>K<\/jats:italic>-NN classifier provides better performance than the others.<\/jats:p>","DOI":"10.1007\/s40747-021-00517-4","type":"journal-article","created":{"date-parts":[[2021,9,25]],"date-time":"2021-09-25T17:02:40Z","timestamp":1632589360000},"page":"489-505","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Machine learning-driven automatic storage space recommendation for object-based cloud storage system"],"prefix":"10.1007","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3247-357X","authenticated-orcid":false,"given":"Anindita Sarkar","family":"Mondal","sequence":"first","affiliation":[]},{"given":"Anirban","family":"Mukhopadhyay","sequence":"additional","affiliation":[]},{"given":"Samiran","family":"Chattopadhyay","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,9,25]]},"reference":[{"key":"517_CR1","doi-asserted-by":"crossref","unstructured":"Bahrami M, Singhal M (2015) The role of cloud computing architecture in big data. In: Information granularity, big data, and computational intelligence. 
Springer, pp 275\u2013295","DOI":"10.1007\/978-3-319-08254-7_13"},{"key":"517_CR2","doi-asserted-by":"crossref","unstructured":"Bisong E (2019) Google cloud machine learning engine (cloud mle). In: Building machine learning and deep learning models on Google Cloud Platform. Springer, pp 545\u2013579","DOI":"10.1007\/978-1-4842-4470-8_41"},{"key":"517_CR3","unstructured":"Borthakur D (2008) Hdfs architecture guide. Hadoop Apache Project 53"},{"key":"517_CR4","unstructured":"Cassandra. http:\/\/cassandra.apache.org\/"},{"key":"517_CR5","doi-asserted-by":"crossref","unstructured":"Chen Y-W, Lin C-J (2006) Combining svms with various feature selection strategies. In: Feature extraction. Springer, pp 315\u2013324","DOI":"10.1007\/978-3-540-35488-8_13"},{"issue":"2","key":"517_CR6","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1109\/MCC.2014.29","volume":"1","author":"E Collins","year":"2014","unstructured":"Collins E (2014) Big data in the public cloud. IEEE Cloud Comput 1(2):13\u201315","journal-title":"IEEE Cloud Comput"},{"issue":"8","key":"517_CR7","first-page":"1","volume":"34","author":"P Cunningham","year":"2007","unstructured":"Cunningham P, Delany SJ (2007) k-nearest neighbour classifiers. Mult Classif Syst 34(8):1\u201317","journal-title":"Mult Classif Syst"},{"key":"517_CR8","unstructured":"forcepoint (2019) Forcepoint advanced classification engine (ace). https:\/\/www.forcepoint.com\/product\/add-on\/advanced-classification-engine-ace?utm\\_source=Websense&utm\\_medium=Redirect&utm\\_content=websense-advanced-classification-engine%3Fcmpid%3Dslblog]. Accessed 19 Nov 2019"},{"key":"517_CR9","unstructured":"Gartner (2020) Aiops (artificial intelligence for it operations). https:\/\/www.gartner.com\/en\/information-technology\/glossary\/aiops-artificial-intelligence-operations. 
Accessed 29 June 2020"},{"key":"517_CR10","doi-asserted-by":"crossref","unstructured":"Giudice O, Paratore A, Moltisanti M, Battiato S (2017) A classification engine for image ballistics of social data. Springer, pp 625\u2013636","DOI":"10.1007\/978-3-319-68548-9_57"},{"issue":"6","key":"517_CR11","doi-asserted-by":"publisher","first-page":"828","DOI":"10.1177\/0272989X10393976","volume":"31","author":"PKJ Han","year":"2011","unstructured":"Han PKJ, Klein WMP, Arora NK (2011) Varieties of uncertainty in health care: a conceptual taxonomy. Med Decis Mak 31(6):828\u2013838","journal-title":"Med Decis Mak"},{"key":"517_CR12","doi-asserted-by":"crossref","unstructured":"Herbrich R (2017) Machine learning at amazon. In: WSDM, p 535","DOI":"10.1145\/3018661.3022764"},{"key":"517_CR13","unstructured":"IBM (2020) Ibm cloud object storage. https:\/\/www.ibm.com\/cloud\/object-storage. Accessed 29 June 2020"},{"key":"517_CR14","unstructured":"Japkowicz N (2006) Why question machine learning evaluation methods. In: AAAI workshop on evaluation methods for machine learning, pp 6\u201311"},{"issue":"3","key":"517_CR15","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1109\/MC.2015.77","volume":"48","author":"K Kaur","year":"2015","unstructured":"Kaur K, Rani R (2015) Managing data in healthcare information systems: many models, one solution. Computer 48(3):52\u201359","journal-title":"Computer"},{"key":"517_CR16","doi-asserted-by":"crossref","unstructured":"Klein S (2017) Azure data factory. Apress, pp 105\u2013122","DOI":"10.1007\/978-1-4842-2143-3_7"},{"key":"517_CR17","doi-asserted-by":"crossref","unstructured":"Levin A, Garion S, Kolodner EK, Lorenz DH, Barabash K, Kugler M, McShane N (2019). Aiops for a cloud object storage service. 
IEEE, pp 165\u2013169","DOI":"10.1109\/BigDataCongress.2019.00036"},{"key":"517_CR18","doi-asserted-by":"crossref","unstructured":"Li Y, Guo L, Wu C, Lee C-H, Guo Y (2014) Building a cloud-based platform for personal health sensor data management. IEEE, pp 223\u2013226","DOI":"10.1109\/BHI.2014.6864344"},{"key":"517_CR19","unstructured":"Liu J, Ji S, Ye J (2009) Multi-task feature learning via efficient l 2, 1-norm minimization. AUAI Press, pp 339\u2013348"},{"key":"517_CR20","doi-asserted-by":"publisher","DOI":"10.7717\/peerj-cs.52","volume":"2","author":"A MacDonald","year":"2016","unstructured":"MacDonald A (2016) Phildb: the time series database with built-in change logging. PeerJ Comput Sci 2:e52","journal-title":"PeerJ Comput Sci"},{"key":"517_CR21","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1016\/j.parco.2004.04.001","volume":"30","author":"ML Massie","year":"2004","unstructured":"Massie ML, Chun BN, Culler DE (2004) The ganglia distributed monitoring system: design, implementation, and experience. Parallel Comput 30:817\u201340","journal-title":"Parallel Comput"},{"key":"517_CR22","unstructured":"McKay C, Fiebrink R, McEnnis D, Li B, Fujinaga I (2005) Ace: a framework for optimizing music classification. In: ISMIR, pp 42\u201349"},{"key":"517_CR23","doi-asserted-by":"crossref","unstructured":"Mondal AS, Chattopadhyay S, Neogy S, Mukherjee N (2016) Object based schema oriented data storage system for supporting heterogeneous data, pp 1025\u20131032","DOI":"10.1109\/ICACCI.2016.7732179"},{"key":"517_CR24","doi-asserted-by":"crossref","unstructured":"Mondal AS, Neogy S, Mukherjee N, Chattopadhyay S (2019) Performance analysis of an efficient object-based schema oriented data storage system handling health data, pp 1\u201315","DOI":"10.1007\/s11334-019-00354-2"},{"key":"517_CR25","unstructured":"Mongodb. 
https:\/\/www.mongodb.org\/"},{"key":"517_CR26","doi-asserted-by":"crossref","unstructured":"Noel RR, Mehra R, Lama P (2019) Towards self-managing cloud storage with reinforcement learning. IEEE, pp 34\u201344","DOI":"10.1109\/IC2E.2019.000-9"},{"key":"517_CR27","unstructured":"Openstack swift. https:\/\/www.swiftstack.com\/docs\/introduction\/openstack_swift.html"},{"key":"517_CR28","doi-asserted-by":"crossref","unstructured":"Palankar MR, Iamnitchi A, Ripeanu M, Garfinkel S (2008) Amazon s3 for science grids: a viable solution? New York","DOI":"10.1145\/1383519.1383526"},{"key":"517_CR29","unstructured":"PSIGEN (2019) Psigen releases accelerated classification engine. https:\/\/www.psigen.com\/?s=Accelerated+Classification+Engine. Accessed 19 Nov 2019"},{"key":"517_CR30","doi-asserted-by":"crossref","unstructured":"Ren J, Chen X, Tan Y, Liu D, Duan M, Liang L, Qiao L (2019) Archivist: a machine learning assisted data placement mechanism for hybrid storage systems. IEEE, pp 676\u2013679","DOI":"10.1109\/ICCD46524.2019.00098"},{"key":"517_CR31","doi-asserted-by":"crossref","unstructured":"Sarkar A, Pant K, Chattopadhyay S (2018) Drsq-a dynamic resource service quality based load balancing algorithm. In: International conference on computational intelligence, communications, and business analytics. Springer, pp 97\u2013108","DOI":"10.1007\/978-981-13-8581-0_8"},{"key":"517_CR32","volume-title":"Gessert F (2015) Ritter Norbert (2015) Towards automated polyglot persistence","author":"M Schaarschmidt","year":"2015","unstructured":"Schaarschmidt M (2015) Gessert F (2015) Ritter Norbert (2015) Towards automated polyglot persistence. Datenbanksysteme f\u00fcr Business, Technologie und Web (BTW"},{"key":"517_CR33","doi-asserted-by":"crossref","unstructured":"Scholkopf B, Smola AJ (2001) Learning with kernels: support vector machines, regularization. optimization, and beyond. 
MIT Press","DOI":"10.7551\/mitpress\/4175.001.0001"},{"key":"517_CR34","unstructured":"Shah G, Voruganti K, Shivam P, Alvarez M (2006) Ace: classification for information lifecycle management"},{"key":"517_CR35","volume-title":"Beatbox classification using ace","author":"E Sinyor","year":"2005","unstructured":"Sinyor E, Rebecca CM, Mcennis D, Fujinaga I (2005) Beatbox classification using ace. Music Information Retrieval, Citeseer"},{"issue":"4","key":"517_CR36","doi-asserted-by":"publisher","first-page":"427","DOI":"10.1016\/j.ipm.2009.03.002","volume":"45","author":"M Sokolova","year":"2009","unstructured":"Sokolova M, Lapalme G (2009) A systematic analysis of performance measures for classification tasks. Inform Process Manag 45(4):427\u2013437","journal-title":"Inform Process Manag"},{"issue":"6","key":"517_CR37","doi-asserted-by":"publisher","first-page":"568","DOI":"10.1109\/72.97934","volume":"2","author":"DF Specht","year":"1991","unstructured":"Specht DF (1991) A general regression neural network. IEEE Trans Neural Netw 2(6):568\u2013576","journal-title":"IEEE Trans Neural Netw"},{"issue":"03","key":"517_CR38","first-page":"54","volume":"15","author":"M Stonebraker","year":"2013","unstructured":"Stonebraker M, Brown P, Zhang D, Becla J (2013) Scidb: a database management system for applications with complex analytics. IEEE Ann Hist Comput 15(03):54\u201362","journal-title":"IEEE Ann Hist Comput"},{"key":"517_CR39","doi-asserted-by":"crossref","unstructured":"Trivedi K, Shah S, Srivastava K (2020) An efficient e-commerce design by implementing a novel data mapper for polyglot persistence. In: Advanced computing technologies and applications. Springer, pp 149\u2013156","DOI":"10.1007\/978-981-15-3242-9_15"},{"key":"517_CR40","unstructured":"Varonis (2019) Varonis, data classification engine. https:\/\/www.varonis.com\/products\/data-classification-engine\/. 
Accessed 19 Nov 2019"},{"key":"517_CR41","unstructured":"Veritas (2019) Veritas introduces new classification engine for intelligent data management across its portfolio. https:\/\/www.veritas.com\/news-releases\/2017-07-25-veritas-introduces-new-classification-engine-for-intelligent-data-management-across-its-portfolio. Accessed 19 Nov 2019"},{"key":"517_CR42","unstructured":"websense (2019) Advanced analysis using real-time classification. https:\/\/www.websense.com\/content\/support\/library\/web\/hosted\/bsky_help\/content_analysis.aspx. Accessed 19 Nov 2019"},{"key":"517_CR43","unstructured":"Weil SA (2007) Ceph: reliable, scalable, and high-performance distributed storage. PhD thesis. University of California Santa Cruz"},{"key":"517_CR44","unstructured":"Weston J, Mukherjee S, Chapelle O, Pontil M, Poggio T, Vapnik V (2001) Feature selection for svms. In: Advances in neural information processing systems, pp 668\u2013674"},{"key":"517_CR45","unstructured":"Zeng L-F, Feng D, Qin LJ (2004) Soss: smart object-based storage system. In: Proceedings of 2004 international conference on machine learning and cybernetics (IEEE Cat. No. 04EX826), vol 5. IEEE, pp 3263\u20133266"},{"key":"517_CR46","unstructured":"Zeng L-F, Feng D, Wang F, Zhou K (2005) Object replication and migration policy based on oss, vol 1. 
IEEE, pp 45\u201349"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-021-00517-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-021-00517-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-021-00517-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,8]],"date-time":"2024-09-08T20:27:29Z","timestamp":1725827249000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-021-00517-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,25]]},"references-count":46,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,2]]}},"alternative-id":["517"],"URL":"https:\/\/doi.org\/10.1007\/s40747-021-00517-4","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,25]]},"assertion":[{"value":"12 April 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 August 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 September 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that 
could have influenced its outcome. We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us. On behalf of all authors, the corresponding author states that there is no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"The dataset is given in the paper as paper content.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Availability of data and material"}},{"value":"Code is uploaded and available in the GitHub and will be made available for public after the acceptance of the article, if required.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}}]}}