{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:19:52Z","timestamp":1740122392285,"version":"3.37.3"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2019,10,28]],"date-time":"2019-10-28T00:00:00Z","timestamp":1572220800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,10,28]],"date-time":"2019-10-28T00:00:00Z","timestamp":1572220800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003382","name":"Core Research for Evolutional Science and Technology","doi-asserted-by":"publisher","award":["JPMJCR1304"],"award-info":[{"award-number":["JPMJCR1304"]}],"id":[{"id":"10.13039\/501100003382","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Min Knowl Disc"],"published-print":{"date-parts":[[2020,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We consider the class of linear predictors over all logical conjunctions of binary attributes, which we refer to as the class of combinatorial binary models\u00a0(CBMs) in this paper. CBMs are of high knowledge interpretability but na\u00efve learning of them from labeled data requires exponentially high computational cost with respect to the length of the conjunctions. On the other hand, in the case of large-scale datasets, long conjunctions are effective for learning predictors. To overcome this computational difficulty, we propose an algorithm,<jats:italic>GRAfting for Binary datasets (GRAB)<\/jats:italic>, which efficiently learns CBMs within the<jats:inline-formula><jats:alternatives><jats:tex-math>$$L_1$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:msub><mml:mi>L<\/mml:mi><mml:mn>1<\/mml:mn><\/mml:msub><\/mml:math><\/jats:alternatives><\/jats:inline-formula>-regularized loss minimization framework. The key idea of GRAB is to adopt weighted frequent itemset mining for the most time-consuming step in the grafting algorithm, which is designed to solve large-scale<jats:inline-formula><jats:alternatives><jats:tex-math>$$L_1$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:msub><mml:mi>L<\/mml:mi><mml:mn>1<\/mml:mn><\/mml:msub><\/mml:math><\/jats:alternatives><\/jats:inline-formula>-RERM problems by an iterative approach. Furthermore, we experimentally showed that linear predictors of CBMs are effective in terms of prediction accuracy and knowledge discovery.<\/jats:p>","DOI":"10.1007\/s10618-019-00657-9","type":"journal-article","created":{"date-parts":[[2019,10,28]],"date-time":"2019-10-28T13:10:09Z","timestamp":1572268209000},"page":"101-123","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Grafting for combinatorial binary model using frequent itemset mining"],"prefix":"10.1007","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2781-7398","authenticated-orcid":false,"given":"Taito","family":"Lee","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shin","family":"Matsushima","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kenji","family":"Yamanishi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2019,10,28]]},"reference":[{"key":"657_CR1","unstructured":"Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on very large databases, pp 487\u2013499"},{"issue":"3","key":"657_CR2","first-page":"183","volume":"19","author":"H Aizenstein","year":"1995","unstructured":"Aizenstein H, Pitt L (1995) On the learnability of disjunctive normal form formulas. Mach Learn 19(3):183\u2013208","journal-title":"Mach Learn"},{"issue":"1","key":"657_CR3","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1186\/1471-2105-7-173","volume":"7","author":"V Andrew","year":"2006","unstructured":"Andrew V, Uzilov JMK, Mathews DH (2006) Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinform 7(1):173","journal-title":"BMC Bioinform"},{"key":"657_CR4","doi-asserted-by":"publisher","first-page":"4308","DOI":"10.1038\/ncomms5308","volume":"5","author":"P Baldi","year":"2014","unstructured":"Baldi P, Sadowski P, Whiteson D (2014) Searching for exotic particles in high-energy physics with deep learning. Nat Commun 5:4308","journal-title":"Nat Commun"},{"key":"657_CR5","unstructured":"Bayardo RJ Jr (1998) Efficiently mining long patterns from databases. In: Proceedings of the 1998 ACM SIGMOD international conference on management of data, pp 85\u201393"},{"key":"657_CR6","volume-title":"Pattern recognition and machine learning (information science and statistics)","author":"CM Bishop","year":"2006","unstructured":"Bishop CM (2006) Pattern recognition and machine learning (information science and statistics). Springer-Verlag New York Inc., Secaucus"},{"key":"657_CR7","volume-title":"Classification and regression trees","author":"L Breiman","year":"1984","unstructured":"Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. CRC Press, Boca Raton"},{"issue":"1","key":"657_CR8","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1006\/inco.1995.1164","volume":"123","author":"NH Bshouty","year":"1995","unstructured":"Bshouty NH (1995) Exact learning boolean functions via the monotone theory. Inf Comput 123(1):146\u2013153","journal-title":"Inf Comput"},{"key":"657_CR9","doi-asserted-by":"crossref","unstructured":"Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, KDD \u201916, pp 785\u2013794. https:\/\/doi.org\/10.1145\/2939672.2939785","DOI":"10.1145\/2939672.2939785"},{"key":"657_CR10","doi-asserted-by":"crossref","unstructured":"Cheng H, Yan X, Han J, Hsu CW (2007) Discriminative frequent pattern analysis for effective classification. In: Proceedings of 2007 IEEE 23rd international conference on data engineering. IEEE, pp 716\u2013725","DOI":"10.1109\/ICDE.2007.367917"},{"issue":"5","key":"657_CR11","doi-asserted-by":"publisher","first-page":"1105","DOI":"10.1162\/089976602753633402","volume":"14","author":"R Collobert","year":"2002","unstructured":"Collobert R, Bengio S, Bengio Y (2002) A parallel mixture of SVMs for very large scale problems. Neural Comput 14(5):1105\u20131114","journal-title":"Neural Comput"},{"issue":"1","key":"657_CR12","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1287\/opre.8.1.101","volume":"8","author":"GB Dantzig","year":"1960","unstructured":"Dantzig GB, Wolfe P (1960) Decomposition principle for linear programs. Oper Res 8(1):101\u2013111","journal-title":"Oper Res"},{"key":"657_CR13","volume-title":"Column generation","author":"G Desaulniers","year":"2006","unstructured":"Desaulniers G, Desrosiers J, Solomon MM (2006) Column generation, vol 5. Springer, Berlin"},{"issue":"8","key":"657_CR14","doi-asserted-by":"publisher","first-page":"1036","DOI":"10.1109\/TKDE.2005.127","volume":"17","author":"M Deshpande","year":"2005","unstructured":"Deshpande M, Kuramochi M, Wale N, Karypis G (2005) Frequent substructure-based approaches for classifying chemical compounds. IEEE Trans Knowl Data Eng 17(8):1036\u20131050","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"Aug","key":"657_CR15","first-page":"1871","volume":"9","author":"RE Fan","year":"2008","unstructured":"Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9(Aug):1871\u20131874","journal-title":"J Mach Learn Res"},{"key":"657_CR16","first-page":"545","volume":"17","author":"I Guyon","year":"2005","unstructured":"Guyon I, Gunn S, Ben-Hur A, Dror G (2005) Result analysis of the NIPS 2003 feature selection challenge. Adv Neural Inf Process Syst 17:545\u2013552","journal-title":"Adv Neural Inf Process Syst"},{"key":"657_CR17","unstructured":"Ho TK (1995) Random decision forests. In: Proceedings of the third international conference on document analysis and recognition, vol 1. IEEE, pp 278\u2013282"},{"issue":"8","key":"657_CR18","doi-asserted-by":"publisher","first-page":"832","DOI":"10.1109\/34.709601","volume":"20","author":"TK Ho","year":"1998","unstructured":"Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832\u2013844","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"657_CR19","unstructured":"Ho TK, Kleinberg EM (1996) Building projectable classifiers of arbitrary complexity. In: Proceedings of the 13th international conference on pattern recognition, vol 2. IEEE, pp 880\u2013885"},{"key":"657_CR20","first-page":"729","volume":"17","author":"T Kudo","year":"2004","unstructured":"Kudo T, Maeda E, Matsumoto Y (2004) An application of boosting to graph classification. Adv Neural Inf Process Syst 17:729\u2013736","journal-title":"Adv Neural Inf Process Syst"},{"key":"657_CR21","unstructured":"Lichman M (2013) UCI machine learning repository. http:\/\/archive.ics.uci.edu\/ml . Accessed 30 Aug 2019"},{"key":"657_CR22","first-page":"4765","volume":"30","author":"SM Lundberg","year":"2017","unstructured":"Lundberg SM, Lee SI (2017) A unified approach to interpreting model predictions. Adv Neural Inf Process Syst 30:4765\u20134774","journal-title":"Adv Neural Inf Process Syst"},{"key":"657_CR23","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"657_CR24","first-page":"1333","volume":"3","author":"S Perkins","year":"2003","unstructured":"Perkins S, Lacker K, Theiler J (2003) Grafting: fast, incremental feature selection by gradient descent in function space. J Mach Learn Res 3:1333\u20131356","journal-title":"J Mach Learn Res"},{"key":"657_CR25","unstructured":"Platt JC (1999) Advances in kernel methods. MIT Press, Cambridge, MA, USA. Chapter fast training of support vector machines using sequential minimal optimization, pp 185\u2013208"},{"key":"657_CR26","unstructured":"Prokhorov D (2001) IJCNN 2001 neural network competition. In: Slide presentation in international joint conference on neural networks 2001. http:\/\/www.geocities.ws\/ijcnn\/nnc_ijcnn01.pdf . Accessed 30 Aug 2019"},{"key":"657_CR27","volume-title":"C4.5: programs for machine learning","author":"JR Quinlan","year":"1993","unstructured":"Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann Publishers, Burlington"},{"key":"657_CR28","doi-asserted-by":"crossref","unstructured":"Ribeiro MT, Singh S, Guestrin C (2016) Why should I trust you? Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1135\u20131144","DOI":"10.1145\/2939672.2939778"},{"key":"657_CR29","doi-asserted-by":"publisher","DOI":"10.1201\/b17758","volume-title":"Sparse modeling: theory, algorithms, and applications","author":"I Rish","year":"2014","unstructured":"Rish I, Grabarnik G (2014) Sparse modeling: theory, algorithms, and applications, 1st edn. CRC Press Inc., Boca Raton","edition":"1"},{"issue":"18","key":"657_CR30","doi-asserted-by":"publisher","first-page":"2455","DOI":"10.1093\/bioinformatics\/btm353","volume":"23","author":"H Saigo","year":"2007","unstructured":"Saigo H, Uno T, Tsuda K (2007) Mining complex genotypic features for predicting HIV-1 drug resistance. Bioinformatics 23(18):2455\u20132462","journal-title":"Bioinformatics"},{"key":"657_CR31","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/8291.001.0001","volume-title":"Boosting: foundations and algorithms","author":"RE Schapire","year":"2012","unstructured":"Schapire RE, Freund Y (2012) Boosting: foundations and algorithms. The MIT Press, Cambridge"},{"key":"657_CR32","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511809682","volume-title":"Kernel methods for pattern analysis","author":"J Shawe-Taylor","year":"2004","unstructured":"Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, New York"},{"key":"657_CR33","doi-asserted-by":"crossref","unstructured":"Tsuda K, Kudo T (2006) Clustering graphs by weighted substructure mining. In: Proceedings of the 23rd international conference on Machine learning, pp 953\u2013960","DOI":"10.1145\/1143844.1143964"},{"key":"657_CR34","unstructured":"Uno T, Asai T, Uchida Y, Arimura H (2003) LCM: an efficient algorithm for enumerating frequent closed item sets. In: Proceedings of the third IEEE international conference on data mining workshop on frequent itemset mining implementations, available as CEUR workshop proceedings, vol 90. http:\/\/ceur-ws.org\/Vol-90\/ . Accessed 30 Aug 2019"},{"key":"657_CR35","unstructured":"Uno T, Kiyomi M, Arimura H (2004) LCM ver. 2: efficient mining algorithms for frequent\/closed\/maximal itemsets. In: Proceedings of the fourth IEEE international conference on data mining workshop on frequent itemset mining implementations, available as CEUR workshop proceedings, vol 126. http:\/\/ceur-ws.org\/Vol-126\/ . Accessed 30 Aug 2019"},{"key":"657_CR36","doi-asserted-by":"crossref","unstructured":"Uno T, Kiyomi M, Arimura H (2005) LCM ver. 3: collaboration of array, bitmap and prefix tree for frequent itemset mining. In: Proceedings of the first international workshop on open source data mining: frequent pattern mining implementations, pp 77\u201386","DOI":"10.1145\/1133905.1133916"},{"key":"657_CR37","unstructured":"Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. In: Proceedings of the third international conference on knowledge discovery and data mining, pp 283\u2013286"}],"container-title":["Data Mining and Knowledge Discovery"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10618-019-00657-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10618-019-00657-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10618-019-00657-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,2]],"date-time":"2022-10-02T20:45:43Z","timestamp":1664743543000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10618-019-00657-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,10,28]]},"references-count":37,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,1]]}},"alternative-id":["657"],"URL":"https:\/\/doi.org\/10.1007\/s10618-019-00657-9","relation":{},"ISSN":["1384-5810","1573-756X"],"issn-type":[{"type":"print","value":"1384-5810"},{"type":"electronic","value":"1573-756X"}],"subject":[],"published":{"date-parts":[[2019,10,28]]},"assertion":[{"value":"13 August 2018","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 October 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 October 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}