{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,7]],"date-time":"2026-01-07T07:54:06Z","timestamp":1767772446638,"version":"3.37.3"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,11,11]],"date-time":"2020-11-11T00:00:00Z","timestamp":1605052800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,11,11]],"date-time":"2020-11-11T00:00:00Z","timestamp":1605052800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n<jats:p>In the light of the recent technological advances in computing and data explosion, the complex interactions of the Sustainable Development Goals (SDG) present both a challenge and an opportunity to researchers and decision makers across fields and sectors. The deep and wide socio-economic, cultural and technological variations across the globe entail a unified understanding of the SDG project. The complexity of SDGs interactions and the dynamics through their indicators align naturally to technical and application specifics that require interdisciplinary solutions. We present a consilient approach to expounding triggers of SDG indicators. Illustrated through data segmentation, it is designed to unify our understanding of the complex overlap of the SDGs by utilising data from different sources. The paper treats each SDG as a Big Data source node, with the potential to contribute towards a unified understanding of applications across the SDG spectrum. 
Data for five SDGs was extracted from the United Nations SDG indicators data repository and used to model spatio-temporal variations in search of robust and consilient scientific solutions. Based on a number of pre-determined assumptions on socio-economic and geo-political variations, the data is subjected to sequential analyses, exploring distributional behaviour, component extraction and clustering. All three methods exhibit pronounced variations across samples, with initial distributional and data segmentation patterns isolating South Africa from the remaining five countries. Data randomness is dealt with via a specially developed algorithm for sampling, measuring and assessing, based on repeated samples of different sizes. Results exhibit consistent variations across samples, based on socio-economic, cultural and geo-political variations entailing a unified understanding, across disciplines and sectors. The findings highlight novel paths towards attaining informative patterns for a unified understanding of the triggers of SDG indicators and open new paths to interdisciplinary research.<\/jats:p>","DOI":"10.1186\/s40537-020-00373-y","type":"journal-article","created":{"date-parts":[[2020,11,11]],"date-time":"2020-11-11T14:03:09Z","timestamp":1605103389000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["A robust machine learning approach to SDG data segmentation"],"prefix":"10.1186","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1134-547X","authenticated-orcid":false,"given":"Kassim S.","family":"Mwitondi","sequence":"first","affiliation":[]},{"given":"Isaac","family":"Munyakazi","sequence":"additional","affiliation":[]},{"given":"Barnabas N.","family":"Gatsheni","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,11,11]]},"reference":[{"key":"373_CR1","unstructured":"SDG, Sustainable Development Goals. 
2015; https:\/\/www.un.org\/sustainabledevelopment\/sustainable-development-goals\/"},{"key":"373_CR2","unstructured":"SDGI, Sustainable Development Goals Indicators. 2017; https:\/\/unstats.un.org\/sdgs\/indicators\/database\/"},{"key":"373_CR3","first-page":"17","volume":"34","author":"A Kharrazi","year":"2017","unstructured":"Kharrazi A. Challenges and opportunities of urban big-data for sustainable development. Asia Pacific Tech Monitor. 2017;34:17\u2013211.","journal-title":"Asia Pacific Tech Monitor"},{"key":"373_CR4","doi-asserted-by":"publisher","first-page":"e38","DOI":"10.2196\/medinform.5359","volume":"4","author":"CS Kruse","year":"2016","unstructured":"Kruse CS, Goswamy R, Raval Y, Marawi S. Challenges and opportunities of big data in health care: a systematic review. JMIR Med Inf. 2016;4:e38.","journal-title":"JMIR Med Inf"},{"key":"373_CR5","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1016\/j.future.2014.10.029","volume":"51","author":"M Yan","year":"2015","unstructured":"Yan M, Haiping W, Lizhe W, Bormin H, Ranjan R, Zomaya A, Wei J. Remote sensing big data computing: challenges and opportunities. Future Gener Comput Syst. 2015;51:47\u201360.","journal-title":"Future Gener Comput Syst"},{"key":"373_CR6","unstructured":"IUCN, In the spirit of nature, everything is connected. 2018; https:\/\/www.iucn.org\/news\/europe\/201801\/spirit-nature-everything-connected"},{"key":"373_CR7","unstructured":"Mwitondi\u00a0KS. Tracking the Potential, Development, and Impact of Information and Communication Technologies in Sub-Saharan Africa; International Council for Science (ICSU-ROA);2018"},{"key":"373_CR8","doi-asserted-by":"crossref","unstructured":"Meusburger\u00a0P. 
In Knowledge and the Economy; Meusburger,\u00a0P., Gl\u00fcckler,\u00a0J., el\u00a0Meskioui,\u00a0M., Eds.; Springer Netherlands: Dordrecht, 2013; pp 15\u201342","DOI":"10.1007\/978-94-007-6131-5"},{"key":"373_CR9","doi-asserted-by":"crossref","unstructured":"Parr\u00a0M, Musker\u00a0R, Schaap\u00a0B. GODAN\u2019S Impact 2014 to 2018 - Improving Agriculture, Food and Nutrition with Open Data;2018","DOI":"10.1079\/CABICOMM-25-8088"},{"key":"373_CR10","unstructured":"UN-Global-Pulse, Big Data for Development: Challenges and Opportunities.UN Global Pulse. 2012"},{"key":"373_CR11","unstructured":"UN-Global-Pulse, Big Data for Development and Humanitarian Action: Towards Responsible Governance. 2016"},{"key":"373_CR12","unstructured":"Bamberger\u00a0M. Integrating Big Data Into the Monitoring and Evaluation of Development Programmes. 2016"},{"key":"373_CR13","unstructured":"Roser\u00a0M, Ortiz-Ospina\u00a0E, Ritchie\u00a0H, Hasell\u00a0J, Gavrilov\u00a0D. Our World in data: Research and interactive data visualizations to understand the world\u2019s largest problems;2018"},{"key":"373_CR14","unstructured":"WBGroup, Atlas of Sustainable Development Goals From World Development Indicators. 2018"},{"key":"373_CR15","unstructured":"Mwitondi\u00a0K, Munyakazi\u00a0I, Gatsheni\u00a0B. An interdisciplinary data-driven framework for development science. DIRISA National Research Data Workshop, CSIR ICC, 19-21 June 2018, Pretoria, RSA2018"},{"key":"373_CR16","unstructured":"Mwitondi\u00a0K, Munyakazi\u00a0I, Gatsheni\u00a0B. Amenability of the United Nations Sustainable Development Goals to Big Data Modelling. International Workshop on Data Science-Present and Future of Open Data and Open Science, 12-15 Nov 2018, Joint Support Centre for Data Science Research, Mishima Citizens Cultural Hall, Mishima, Shizuoka, Japan2018"},{"key":"373_CR17","unstructured":"Ishikawa\u00a0K. 
Guide to quality control; Asian Productivity Organization;1976"},{"key":"373_CR18","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1016\/j.ecoser.2012.07.008","volume":"1","author":"E Primmer","year":"2012","unstructured":"Primmer E, Furman E. Operationalising ecosystem service approaches for governance: do measuring, mapping and valuing integrate sector-specific knowledge systems? Ecosyst Serv. 2012;1:85\u201392.","journal-title":"Ecosyst Serv"},{"issue":"3","key":"373_CR19","doi-asserted-by":"publisher","first-page":"293","DOI":"10.12785\/jsap\/020312","volume":"2","author":"KS Mwitondi","year":"2013","unstructured":"Mwitondi KS, Said RA. A data-based method for harmonising heterogeneous data modelling techniques across data mining applications. J Stat Appl Probab. 2013;2(3):293\u2013305.","journal-title":"J Stat Appl Probab"},{"key":"373_CR20","doi-asserted-by":"publisher","first-page":"WDS247","DOI":"10.2481\/dsj.WDS-045","volume":"12","author":"KS Mwitondi","year":"2013","unstructured":"Mwitondi KS, Moustafa RE, Hadi AS. A data-driven method for selecting optimal models based on graphical visualisation of differences in sequentially fitted ROC model parameters. Data Sci J. 2013;12:WDS247\u2013WDS253.","journal-title":"Data Sci J"},{"key":"373_CR21","unstructured":"SDGTI, Sustainable Development Goals Targets & Indicators. 2020; https:\/\/unstats.un.org\/sdgs\/metadata\/"},{"key":"373_CR22","unstructured":"Lloyd\u00a0SP. Least squares quantization in PCM. Technical Report RR-5497, Bell Laboratories. 1957."},{"key":"373_CR23","unstructured":"MacQueen JB. Some methods for classification and analysis of multivariate observations. 1967;1:281\u201397."},{"key":"373_CR24","volume-title":"Machine learning algorithms","author":"J Chapmann","year":"2017","unstructured":"Chapmann\u00a0J. 
Machine learning algorithms; CreateSpace Independent Publishing Platform, 2017"},{"key":"373_CR25","volume-title":"Introduction to clustering large and high-dimensional data","author":"J Kogan","year":"2007","unstructured":"Kogan J. Introduction to clustering large and high-dimensional data. Cambridge: Cambridge University Press; 2007."},{"key":"373_CR26","first-page":"230","volume":"27","author":"KS Mwitondi","year":"2018","unstructured":"Mwitondi KS, Zargari SA. An iterative multiple sampling method for intrusion detection. Inf Secur J. 2018;27:230\u20139.","journal-title":"Inf Secur J"},{"key":"373_CR27","doi-asserted-by":"publisher","first-page":"961","DOI":"10.1162\/neco.2006.18.4.961","volume":"18","author":"L Bo","year":"2006","unstructured":"Bo L, Wang L, Jiao L. Feature scaling for Kernel Fisher discriminant analysis using leave-one-out cross validation. Neural Comput. 2006;18:961\u201378.","journal-title":"Neural Comput"},{"key":"373_CR28","first-page":"507780","volume":"1","author":"F Galkin","year":"2018","unstructured":"Galkin F, Aliper A, Putin E, Kuznetsov I, Gladyshev VN, Zhavoronkov A. Human microbiome aging clocks based on deep learning and tandem of permutation feature importance and accumulated local effects. BioRxiv. 2018;1:507780.","journal-title":"BioRxiv."},{"key":"373_CR29","doi-asserted-by":"publisher","first-page":"377","DOI":"10.1145\/362384.362685","volume":"13","author":"EF Codd","year":"1970","unstructured":"Codd EF. A relational model of data for large shared data banks. Commun ACM. 1970;13:377\u201387.","journal-title":"Commun ACM"},{"key":"373_CR30","unstructured":"SDGCA, Sustainable Development Goals. 
2015; https:\/\/sdgcafrica.org\/"}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-020-00373-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s40537-020-00373-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-020-00373-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,11]],"date-time":"2020-11-11T14:42:31Z","timestamp":1605105751000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-020-00373-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,11]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["373"],"URL":"https:\/\/doi.org\/10.1186\/s40537-020-00373-y","relation":{},"ISSN":["2196-1115"],"issn-type":[{"type":"electronic","value":"2196-1115"}],"subject":[],"published":{"date-parts":[[2020,11,11]]},"assertion":[{"value":"19 May 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 November 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 November 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"All the three authors declare that there are no competing interests in publishing this paper, be they financial or non-financial and, as a co-authored paper, the costs of publishing will be shared among the three 
institutions.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"97"}}