{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,24]],"date-time":"2025-10-24T08:21:52Z","timestamp":1761294112133,"version":"3.41.0"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2020,5,3]],"date-time":"2020-05-03T00:00:00Z","timestamp":1588464000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["71731009, 2018WZDXM020; and 71433001"],"award-info":[{"award-number":["71731009, 2018WZDXM020; and 71433001"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"crossref","award":["R01GM118574 and R35GM134927"],"award-info":[{"award-number":["R01GM118574 and R35GM134927"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100010880","name":"State Grid Corporation of China","doi-asserted-by":"crossref","award":["5418-201958524A-0-0-00"],"award-info":[{"award-number":["5418-201958524A-0-0-00"]}],"id":[{"id":"10.13039\/501100010880","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Manage. Inf. Syst."],"published-print":{"date-parts":[[2020,6,30]]},"abstract":"<jats:p>Many applications use data that are better represented in the binary matrix form, such as click-stream data, market basket data, document-term data, user-permission data in access control, and others. Matrix factorization methods have been widely used tools for the analysis of high-dimensional data, as they automatically extract sparse and meaningful features from data vectors. However, existing matrix factorization methods do not work well for the binary data. One crucial limitation is interpretability, as many matrix factorization methods decompose an input matrix into matrices with fractional or even negative components, which are hard to interpret in many real settings. Some matrix factorization methods, like binary matrix factorization, do limit decomposed matrices to binary values. However, these models are not flexible to accommodate some data analysis tasks, like trading off summary size with quality and discriminating different types of approximation errors. To address those issues, this article presents weighted rank-one binary matrix factorization, which is to approximate a binary matrix by the product of two binary vectors, with parameters controlling different types of approximation errors. By systematically running weighted rank-one binary matrix factorization, one can effectively perform various binary data analysis tasks, like compression, clustering, and pattern discovery. Theoretical properties on weighted rank-one binary matrix factorization are investigated and its connection to problems in other research domains are examined. As weighted rank-one binary matrix factorization in general is NP-hard, efficient and effective algorithms are presented. Extensive studies on applications of weighted rank-one binary matrix factorization are also conducted.<\/jats:p>","DOI":"10.1145\/3386599","type":"journal-article","created":{"date-parts":[[2020,5,5]],"date-time":"2020-05-05T01:50:48Z","timestamp":1588643448000},"page":"1-33","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Algorithms and Applications to Weighted Rank-one Binary Matrix Factorization"],"prefix":"10.1145","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0266-6191","authenticated-orcid":false,"given":"Haibing","family":"Lu","sequence":"first","affiliation":[{"name":"Santa Clara Unviersity, Santa Clara, CA USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xi","family":"Chen","sequence":"additional","affiliation":[{"name":"GEIRI North America, San Jose, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Junmin","family":"Shi","sequence":"additional","affiliation":[{"name":"New Jersey Institute of Technology, University Heights, Newark, NJ, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7420-6947","authenticated-orcid":false,"given":"Jaideep","family":"Vaidya","sequence":"additional","affiliation":[{"name":"Rutgers University, Washington Park, Newark, NJ, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vijayalakshmi","family":"Atluri","sequence":"additional","affiliation":[{"name":"Rutgers University, Washington Park, Newark, NJ, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuan","family":"Hong","sequence":"additional","affiliation":[{"name":"Illinois Institute of Technology, Chicago, IL, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Huang","sequence":"additional","affiliation":[{"name":"Southern University of Science 8 Technology; Xi\u2019an Jiaotong University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,5,3]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1976.1162766"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1287\/ijoc.1110.0461"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01587084"},{"key":"e_1_2_1_5_1","first-page":"194","article-title":"Set covering and involutory bases. Manage","volume":"18","author":"Bellmore Mandell","year":"1971","unstructured":"Mandell Bellmore and H. Donald Ratliff. 1971. Set covering and involutory bases. Manage. Sci. 18, 3 (1971), 194--206.","journal-title":"Sci."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1137\/1037127"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10044-007-0096-4"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1287\/ijoc.11.3.217"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2001.914857"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1287\/opre.47.5.730"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.4.3.233"},{"key":"e_1_2_1_12_1","unstructured":"M. Dawande P. Keskinocak and S. Tayur. 1997. On the biclique problem in bipartite graphs. GSIA working paper 1996-04 Carnegie-Mellon University."},{"key":"e_1_2_1_13_1","first-page":"1407","article-title":"The digitization of word of mouth: Promise and challenges of online feedback mechanisms. Manage","volume":"49","author":"Dellarocas Chrysanthos","year":"2003","unstructured":"Chrysanthos Dellarocas. 2003. The digitization of word of mouth: Promise and challenges of online feedback mechanisms. Manage. Sci. 49, 10 (2003), 1407--1424.","journal-title":"Sci."},{"volume-title":"Proceedings of the ACM Symposium on Access Control Models and Technologies (SACMAT\u201908)","author":"Ene Alina","key":"e_1_2_1_14_1","unstructured":"Alina Ene, William Horne, Nikola Milosavljevic, Prasad Rao, Robert Schreiber, and Robert E. Tarjan. 2008. Fast exact and heuristic methods for role minimization problems. In Proceedings of the ACM Symposium on Access Control Models and Technologies (SACMAT\u201908). 1--10."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/501978.501980"},{"key":"e_1_2_1_16_1","first-page":"1","article-title":"Multi-assignment clustering for Boolean data","volume":"13","author":"Frank Mario","year":"2012","unstructured":"Mario Frank, Andreas P. Streich, David Basin, and Joachim M. Buhmann. 2012. Multi-assignment clustering for Boolean data. J. Mach. Learn. Res. 13, 1 (Feb. 2012), 459--489.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bti653"},{"volume-title":"Discovery Science","author":"Geerts Floris","key":"e_1_2_1_18_1","unstructured":"Floris Geerts, Bart Goethals, and Taneli Mielikainen. 2004. Tiling databases. In Discovery Science. Springer, 278--289."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2821146"},{"key":"e_1_2_1_20_1","first-page":"336","article-title":"Adaptive memory tabu search for binary quadratic programs. Manage","volume":"44","author":"Glover Fred","year":"1998","unstructured":"Fred Glover, Gary A. Kochenberger, and Bahram Alidaee. 1998. Adaptive memory tabu search for binary quadratic programs. Manage. Sci. 44, 3 (1998), 336--345.","journal-title":"Sci."},{"key":"e_1_2_1_21_1","volume-title":"Matrix Computations","author":"Golub G. H.","unstructured":"G. H. Golub and C. F. Van Loan. 1996. Matrix Computations (3rd ed.). The Johns Hopkins Universtiy Press, Baltimore, MD.","edition":"3"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/3067421.3067425"},{"key":"e_1_2_1_23_1","article-title":"Structured low-rank matrix factorization: Global optimality, algorithms, and applications","volume":"14","author":"Haeffele Benjamin David","year":"2019","unstructured":"Benjamin David Haeffele and Ren\u00e9 Vidal. 2019. Structured low-rank matrix factorization: Global optimality, algorithms, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 14, 8 (2019).","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01585160"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijar.2006.01.004"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2007.07.038"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/358407.358424"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956770"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2005.55"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/775412.775435"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.1987.4309069"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-1-55860-377-6.50048-7"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1038\/44565"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1081870.1081894"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611974317.7"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.3233\/JCS-130484"},{"key":"e_1_2_1_37_1","unstructured":"Haibing Lu J. Vaidya and V. Atluri. [n.d.] An optimization framework for role mining. (unpublished)."},{"key":"e_1_2_1_38_1","first-page":"5","article-title":"Constraint-aware role mining via extended Boolean matrix decomposition","volume":"9","author":"Lu Haibing","year":"2012","unstructured":"Haibing Lu, J. Vaidya, V. Atluri, and Yuan Hong. 2012. Constraint-aware role mining via extended Boolean matrix decomposition. IEEE Trans. Depend. Sec. Comput. 9, 5 (Sept.-Oct. 2012), 655--669.","journal-title":"IEEE Trans. Depend. Sec. Comput."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972818.25"},{"volume-title":"Introduction to Information Retrieval","author":"Manning Christopher D.","key":"e_1_2_1_40_1","unstructured":"Christopher D. Manning, Prabhakar Raghavan, and Hinrich Sch\u00fctze. 2008. Introduction to Information Retrieval. Cambridge University Press, New York, NY."},{"key":"e_1_2_1_41_1","volume-title":"Big Data: The Next Frontier for Innovation, Competition, and Productivity. Technical Report","author":"Manyika James","year":"2011","unstructured":"James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela Hung Byers. 2011. Big Data: The Next Frontier for Innovation, Competition, and Productivity. Technical Report. McKinsey Global Institute."},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the Genetic and Evolutionary Computation Conference (GECCO\u201999)","author":"Merz Peter","year":"1999","unstructured":"Peter Merz and Bernd Freisleben. 1999. Genetic algorithms for binary quadratic programming. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO\u201999). Morgan Kauffman, 417--424."},{"key":"e_1_2_1_43_1","first-page":"2","article-title":"Greedy and local search heuristics for unconstrained binary quadratic programming","volume":"8","author":"Bernd Freisleben Peter","year":"2002","unstructured":"Peter MerZ and Bernd Freisleben. 2002. Greedy and local search heuristics for unconstrained binary quadratic programming. J. Heurist. 8, 2 (Mar. 2002), 197--213.","journal-title":"J. Heurist."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2008.53"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0166-218X(03)00333-0"},{"volume-title":"Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics. 546431","author":"Schein Andrew I.","key":"e_1_2_1_46_1","unstructured":"Andrew I. Schein, Lawrence K. Saul, and Lyle H. Ungar. 2003. A generalized linear model for principal component analysis of binary data. In Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics. 546431."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772862"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2004.11.005"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557103"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335376"},{"key":"e_1_2_1_51_1","volume-title":"Noisy-OR component analysis and its application to link analysis. J. Mach. Learn. Res. 7 (Dec","author":"Singliar Tom\u00e1\u0161","year":"2006","unstructured":"Tom\u00e1\u0161 Singliar and Milo\u0161 Hauskrecht. 2006. Noisy-OR component analysis and its application to link analysis. J. Mach. Learn. Res. 7 (Dec. 2006), 2189--2213."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1287\/ijoc.15.2.208.14448"},{"key":"e_1_2_1_53_1","volume-title":"Introduction to Data Mining","author":"Tan Pang-Ning","unstructured":"Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. 2005. Introduction to Data Mining (1st ed.). Addison-Wesley Longman, Boston, MA.","edition":"1"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/1266840.1266870"},{"key":"e_1_2_1_55_1","volume-title":"Proceedings of the European Symposium on Artifical Neural Networks (ESANN\u201912)","volume":"12","author":"Vellido Alfredo","unstructured":"Alfredo Vellido, Jos\u00e9 David Mart\u00edn-Guerrero, and Paulo J. G. Lisboa. 2012. Making machine learning models interpretable. In Proceedings of the European Symposium on Artifical Neural Networks (ESANN\u201912), Vol. 12. Citeseer, 163--172."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2007.99"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-009-0145-2"}],"container-title":["ACM Transactions on Management Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3386599","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3386599","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:38:19Z","timestamp":1750199899000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3386599"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,3]]},"references-count":56,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,6,30]]}},"alternative-id":["10.1145\/3386599"],"URL":"https:\/\/doi.org\/10.1145\/3386599","relation":{},"ISSN":["2158-656X","2158-6578"],"issn-type":[{"type":"print","value":"2158-656X"},{"type":"electronic","value":"2158-6578"}],"subject":[],"published":{"date-parts":[[2020,5,3]]},"assertion":[{"value":"2017-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-05-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}