{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T06:29:48Z","timestamp":1760596188084,"version":"3.41.0"},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2006,3,1]],"date-time":"2006-03-01T00:00:00Z","timestamp":1141171200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Math. Softw."],"published-print":{"date-parts":[[2006,3]]},"abstract":"<jats:p>This article presents the design and implementation of a software tool, PROXIMUS, for error-bounded approximation of high-dimensional binary attributed datasets based on nonorthogonal decomposition of binary matrices. This tool can be used for analyzing data arising in a variety of domains ranging from commercial to scientific applications. Using a combination of innovative algorithms, novel data structures, and efficient implementation, PROXIMUS demonstrates excellent accuracy, performance, and scalability to large datasets. We experimentally demonstrate these on diverse applications in association rule mining and DNA microarray analysis. 
In limited beta release, PROXIMUS currently has over 300 installations in over 10 countries.<\/jats:p>","DOI":"10.1145\/1132973.1132976","type":"journal-article","created":{"date-parts":[[2006,7,25]],"date-time":"2006-07-25T14:14:26Z","timestamp":1153836866000},"page":"33-69","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":27,"title":["Nonorthogonal decomposition of binary matrices for bounded-error data compression and analysis"],"prefix":"10.1145","volume":"32","author":[{"given":"Mehmet","family":"Koyut\u00fcrk","sequence":"first","affiliation":[{"name":"Department of Computer Sciences, Purdue University, West Lafayette, IN"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ananth","family":"Grama","sequence":"additional","affiliation":[{"name":"Department of Computer Sciences, Purdue University, West Lafayette, IN"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Naren","family":"Ramakrishnan","sequence":"additional","affiliation":[{"name":"Department of Computer Sciences, Virginia Tech., Blacksburg, VA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2006,3]]},"reference":[{"volume-title":"Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94)","author":"Agrawal R.","key":"e_1_2_1_1_1","unstructured":"Agrawal , R. and Srikant , R . 1994. Fast algorithms for mining association rules . In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94) . 487--499. Agrawal, R. and Srikant, R. 1994. Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94). 487--499."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1137\/1037127"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009740529316"},{"key":"e_1_2_1_4_1","unstructured":"Borgelt C. 1996. 
Finding association rules\/hyperedges with the apriori algorithm. http:\/\/fuzzy.cs.Uni-Magdeburg.de\/borgelt\/apriori\/apriori.html.  Borgelt C. 1996. Finding association rules\/hyperedges with the apriori algorithm. http:\/\/fuzzy.cs.Uni-Magdeburg.de\/borgelt\/apriori\/apriori.html."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/362342.362367"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the 4th SIAM International Conference on Data Mining (SDM","author":"Chi J.","year":"2004","unstructured":"Chi , J. , Koyut\u00fcrk , M. , and Grama , A . 2004. Conquest: A distributed tool for constructing summaries of high-dimensional discrete-attributed datasets . In Proceedings of the 4th SIAM International Conference on Data Mining (SDM 2004 ). 154--165. Chi, J., Koyut\u00fcrk, M., and Grama, A. 2004. Conquest: A distributed tool for constructing summaries of high-dimensional discrete-attributed datasets. In Proceedings of the 4th SIAM International Conference on Data Mining (SDM 2004). 154--165."},{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1016\/S1097-2765(00)80114-8","article-title":"A genome-wide transcriptional analysis of the mitotic cell cycle","volume":"2","author":"Cho R. J.","year":"1998","unstructured":"Cho , R. J. , Campbell , M. J. , Winzeler , E. A. , Steinmetz , L. , Conway , A. , Wodicka , L. , Wolfsberg , T. G. , Gabrielian , A. E. , Landsman , D. , Lockhart , D. J. , and Davis , R. W. 1998 . A genome-wide transcriptional analysis of the mitotic cell cycle . Molecular Cell 2 , 1, 65 -- 73 . Cho, R. J., Campbell, M. J., Winzeler, E. A., Steinmetz, L., Conway, A., Wodicka, L., Wolfsberg, T. G., Gabrielian, A. E., Landsman, D., Lockhart, D. J., and Davis, R. W. 1998. A genome-wide transcriptional analysis of the mitotic cell cycle. 
Molecular Cell 2, 1, 65--73.","journal-title":"Molecular Cell"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1137\/S0895479800382555"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Cover T. M. and Thomas J. A. 1991. Elements of Information Theory. Wiley & Sons New York.   Cover T. M. and Thomas J. A. 1991. Elements of Information Theory. Wiley & Sons New York.","DOI":"10.1002\/0471200611"},{"key":"e_1_2_1_10_1","unstructured":"Duff I. S. Erisman A. M. and Reid J. K. 1987. Direct Methods for Sparse Matrices. Clarendon Press New York.   Duff I. S. Erisman A. M. and Reid J. K. 1987. Direct Methods for Sparse Matrices. Clarendon Press New York."},{"volume-title":"Proceedings of the 24th Very Large Data Bases Conference.","author":"Gibson D.","key":"e_1_2_1_11_1","unstructured":"Gibson , D. , Kleinberg , J. , and Raghavan , P . 1998. Clustering categorical data: An approach based on dynamical systems . In Proceedings of the 24th Very Large Data Bases Conference. Gibson, D., Kleinberg, J., and Raghavan, P. 1998. Clustering categorical data: An approach based on dynamical systems. In Proceedings of the 24th Very Large Data Bases Conference."},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/MASSP.1984.1162229","article-title":"Vector quantization","volume":"1","author":"Gray R. M.","year":"1984","unstructured":"Gray , R. M. 1984 . Vector quantization . IEEE ASSP 1 , 2, 4 -- 29 . Gray, R. M. 1984. Vector quantization. IEEE ASSP 1, 2, 4--29.","journal-title":"IEEE ASSP"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4379(00)00022-3"},{"volume-title":"SPIE Proceedings.","author":"Gupta G.","key":"e_1_2_1_14_1","unstructured":"Gupta , G. and Ghosh , J . 2001. Value balanced agglomerative connectivity clustering . In SPIE Proceedings. Gupta, G. and Ghosh, J. 2001. Value balanced agglomerative connectivity clustering. 
In SPIE Proceedings."},{"key":"e_1_2_1_15_1","first-page":"15","article-title":"Hypergraph-based clustering in high-dimensional datasets: A summary of results","volume":"21","author":"Han E.","year":"1998","unstructured":"Han , E. , Karypis , G. , Kumar , V. , and Mobasher , B. 1998 . Hypergraph-based clustering in high-dimensional datasets: A summary of results . Bull. Tech. Committee Data Eng. 21 , 1, 15 -- 22 . Han, E., Karypis, G., Kumar, V., and Mobasher, B. 1998. Hypergraph-based clustering in high-dimensional datasets: A summary of results. Bull. Tech. Committee Data Eng. 21, 1, 15--22.","journal-title":"Bull. Tech. Committee Data Eng."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/360402.360421"},{"key":"e_1_2_1_17_1","unstructured":"Huang Z. 1997. A fast clustering algorithm to cluster very large categorical data sets in data mining. In Research Issues on Data Mining and Knowledge Discovery.  Huang Z. 1997. A fast clustering algorithm to cluster very large categorical data sets in data mining. In Research Issues on Data Mining and Knowledge Discovery."},{"key":"e_1_2_1_18_1","unstructured":"IBM. Quest synthetic data generation code. http:\/\/www.almaden.ibm.com\/cs\/quest\/syndata.html.  IBM. Quest synthetic data generation code. http:\/\/www.almaden.ibm.com\/cs\/quest\/syndata.html."},{"volume-title":"Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, KDD, E. Simoudis et al., eds. AAAI Press, 367--370","author":"John G. H.","key":"e_1_2_1_19_1","unstructured":"John , G. H. and Langley , P . 1996. Static versus dynamic sampling for data mining . In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, KDD, E. Simoudis et al., eds. AAAI Press, 367--370 . John, G. H. and Langley, P. 1996. Static versus dynamic sampling for data mining. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, KDD, E. Simoudis et al., eds. 
AAAI Press, 367--370."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1137\/S1064827595287997"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/291128.291131"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/358407.358424"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD","author":"Koyut\u00fcrk M.","year":"2003","unstructured":"Koyut\u00fcrk , M. and Grama , A . 2003. Proximus: A framework for analyzing very high-dimensional discrete attributed datasets . In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003 ). 147--156. 10.1145\/956750.956770 Koyut\u00fcrk, M. and Grama, A. 2003. Proximus: A framework for analyzing very high-dimensional discrete attributed datasets. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003). 147--156. 10.1145\/956750.956770"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 2nd IEEE Computational Systems Bioinformatics Conference (CSB","author":"Koyut\u00fcrk M.","year":"2003","unstructured":"Koyut\u00fcrk , M. , Grama , A. , and Szpankowski , W . 2003. Algorithms for bounded-error correlation of high dimensional data in microarray experiments . In Proceedings of the 2nd IEEE Computational Systems Bioinformatics Conference (CSB 2003 ). 575--580. Koyut\u00fcrk, M., Grama, A., and Szpankowski, W. 2003. Algorithms for bounded-error correlation of high dimensional data in microarray experiments. In Proceedings of the 2nd IEEE Computational Systems Bioinformatics Conference (CSB 2003). 575--580."},{"volume-title":"Introduction to Scientific Computing","author":"Loan C. F. V.","key":"e_1_2_1_25_1","unstructured":"Loan , C. F. V. 2000. Introduction to Scientific Computing . Prentice Hall, Englewood Cliffs , N.J. Loan, C. F. V. 2000. Introduction to Scientific Computing. 
Prentice Hall, Englewood Cliffs, N.J."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 5th Berkeley Symposium","volume":"1","author":"MacQueen J.","year":"1967","unstructured":"MacQueen , J. 1967 . Some methods for classification and analysis of multivariate observations . In Proceedings of the 5th Berkeley Symposium , vol. 1 . 281--297. MacQueen, J. 1967. Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium, vol. 1. 281--297."},{"key":"e_1_2_1_27_1","volume-title":"Tech. Rep. 2001-452, Dept. of Computing and Information Science","author":"McConnell S.","year":"2001","unstructured":"McConnell , S. and Skillicorn , D. B . 2001 . Outlier detection using semi-discrete decomposition. Tech. Rep. 2001-452, Dept. of Computing and Information Science , Queen's University . McConnell, S. and Skillicorn, D. B. 2001. Outlier detection using semi-discrete decomposition. Tech. Rep. 2001-452, Dept. of Computing and Information Science, Queen's University."},{"key":"e_1_2_1_28_1","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1109\/TCOM.1983.1095823","article-title":"Digital image compression by outer product expansion","volume":"31","author":"O'Leary D. P.","year":"1983","unstructured":"O'Leary , D. P. and Peleg , S. 1983 . Digital image compression by outer product expansion . IEEE Trans. Commu. 31 , 441 -- 444 . O'Leary, D. P. and Peleg, S. 1983. Digital image compression by outer product expansion. IEEE Trans. Commu. 31, 441--444.","journal-title":"IEEE Trans. Commu."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:DAMI.0000026903.59233.2a"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0166-218X(03)00333-0"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1009876119989"},{"key":"e_1_2_1_32_1","unstructured":"Saad Y. 1996. Iterative Methods for Sparse Linear Systems. PWS.   Saad Y. 1996. 
Iterative Methods for Sparse Linear Systems. PWS."},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","first-page":"3273","DOI":"10.1091\/mbc.9.12.3273","article-title":"Comprehensive identification of cell cycle-regulated genes of the yeast saccharomyces cerevisiae by microarray hybridization","volume":"9","author":"Spellman P. T.","year":"1998","unstructured":"Spellman , P. T. , Sherlock , G. , Zhang , M. Q. , Iyer , V. R. , Anders , K. , Eisen , M. B. , Brown , P. O. , Botstein , D. , and Futcher , B. 1998 . Comprehensive identification of cell cycle-regulated genes of the yeast saccharomyces cerevisiae by microarray hybridization . Molecular Biology of the Cell 9 , 3273 -- 3297 . Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R., Anders, K., Eisen, M. B., Brown, P. O., Botstein, D., and Futcher, B. 1998. Comprehensive identification of cell cycle-regulated genes of the yeast saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9, 3273--3297.","journal-title":"Molecular Biology of the Cell"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 22th International Conference on Very Large Databases (VLDB'96)","author":"Toivonen H.","year":"1996","unstructured":"Toivonen , H. 1996 . Sampling large databases for association rules . In Proceedings of the 22th International Conference on Very Large Databases (VLDB'96) . 134--145. Toivonen, H. 1996. Sampling large databases for association rules. In Proceedings of the 22th International Conference on Very Large Databases (VLDB'96). 134--145."},{"key":"e_1_2_1_35_1","volume-title":"Tech. Rep. TR617.","author":"Zaki M. J.","year":"1996","unstructured":"Zaki , M. J. , Parthasarathy , S. , Li , W. , and Ogihara , M . 1996 . Evaluation of sampling for data mining of association rules. Tech. Rep. TR617. Zaki, M. J., Parthasarathy, S., Li, W., and Ogihara, M. 1996. Evaluation of sampling for data mining of association rules. Tech. Rep. 
TR617."},{"key":"e_1_2_1_36_1","doi-asserted-by":"crossref","unstructured":"Zyto S. Grama A. and Szpankowski W. 2002. Semi-discrete matrix transforms (SDD) for image and video compression. In Process Coordination and Ubiquitous Computing D. Marinescu and C. Lee eds. Kluwer Amsterdam 249--259.   Zyto S. Grama A. and Szpankowski W. 2002. Semi-discrete matrix transforms (SDD) for image and video compression. In Process Coordination and Ubiquitous Computing D. Marinescu and C. Lee eds. Kluwer Amsterdam 249--259.","DOI":"10.1201\/9781003072492-19"}],"container-title":["ACM Transactions on Mathematical Software"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1132973.1132976","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1132973.1132976","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T15:06:13Z","timestamp":1750259173000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1132973.1132976"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,3]]},"references-count":36,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,3]]}},"alternative-id":["10.1145\/1132973.1132976"],"URL":"https:\/\/doi.org\/10.1145\/1132973.1132976","relation":{},"ISSN":["0098-3500","1557-7295"],"issn-type":[{"type":"print","value":"0098-3500"},{"type":"electronic","value":"1557-7295"}],"subject":[],"published":{"date-parts":[[2006,3]]},"assertion":[{"value":"2006-03-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}