{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T04:31:05Z","timestamp":1759638665443,"version":"3.41.0"},"reference-count":43,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2015,5,6]],"date-time":"2015-05-06T00:00:00Z","timestamp":1430870400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. ACM"],"published-print":{"date-parts":[[2015,5,6]]},"abstract":"<jats:p>\n            Given a set of\n            <jats:italic>n<\/jats:italic>\n            <jats:italic>d<\/jats:italic>\n            -dimensional Boolean vectors with the promise that the vectors are chosen uniformly at random with the exception of two vectors that have Pearson correlation coefficient \u03c1 (Hamming distance\n            <jats:italic>d<\/jats:italic>\n            \u00b7 1-\u03c1\/ 2), how quickly can one find the two correlated vectors? We present an algorithm which, for any constant \u03f5&gt;0, and constant \u03c1&gt;0, runs in expected time\n            <jats:italic>O<\/jats:italic>\n            (n\n            <jats:sup>5-\u03c9 \/ 4-\u03c9+\u03f5<\/jats:sup>\n            +nd) &lt;\n            <jats:italic>O<\/jats:italic>\n            (n\n            <jats:sup>1.62<\/jats:sup>\n            +nd), where \u03c9 &lt; 2.4 is the exponent of matrix multiplication. This is the first subquadratic--time algorithm for this problem for which \u03c1 does not appear in the exponent of\n            <jats:italic>n<\/jats:italic>\n            , and improves upon\n            <jats:italic>O<\/jats:italic>\n            (n\n            <jats:sup>2-O(\u03c1)<\/jats:sup>\n            ), given by Paturi et al. [1989], the Locality Sensitive Hashing approach of Motwani [1998] and the Bucketing Codes approach of Dubiner [2008].\n          <\/jats:p>\n          <jats:p>Applications and extensions of this basic algorithm yield significantly improved algorithms for several other problems.<\/jats:p>\n          <jats:p>\n            <jats:italic>Approximate Closest Pair.<\/jats:italic>\n            For any sufficiently small constant \u03f5&gt;0, given\n            <jats:italic>n<\/jats:italic>\n            <jats:italic>d<\/jats:italic>\n            -dimensional vectors, there exists an algorithm that returns a pair of vectors whose Euclidean (or Hamming) distance differs from that of the closest pair by a factor of at most 1+\u03f5, and runs in time\n            <jats:italic>O<\/jats:italic>\n            (n\n            <jats:sup>2-\u0398(\u221a\u03f5<\/jats:sup>\n            )). The best previous algorithms (including Locality Sensitive Hashing) have runtime\n            <jats:italic>O<\/jats:italic>\n            (n\n            <jats:sup>2-O(\u03f5)<\/jats:sup>\n            ).\n          <\/jats:p>\n          <jats:p>\n            <jats:italic>Learning Sparse Parities with Noise.<\/jats:italic>\n            Given samples from an instance of the learning parities with noise problem where each example has length\n            <jats:italic>n<\/jats:italic>\n            , the true parity set has size at most\n            <jats:italic>k<\/jats:italic>\n            \u00ab\n            <jats:italic>n<\/jats:italic>\n            , and the noise rate is \u03b7, there exists an algorithm that identifies the set of\n            <jats:italic>k<\/jats:italic>\n            indices in time\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>\u03c9+\u03f5\/3 k<\/jats:sup>\n            <jats:italic>poly(1\/1-2\u03b7)<\/jats:italic>\n            &lt;\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>0.8k<\/jats:sup>\n            poly(1\/1-2 \u03b7). This is the first algorithm with no dependence on \u03b7 in the exponent of\n            <jats:italic>n<\/jats:italic>\n            , aside from the trivial\n            <jats:italic>O<\/jats:italic>\n            ((\n            <jats:sup>n<\/jats:sup>\n            <jats:sub>k<\/jats:sub>\n            )) \u2248\n            <jats:italic>O<\/jats:italic>\n            (\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>k<\/jats:sup>\n            ) brute-force algorithm, and for large noise rates (\u03b7 &gt; 0.4), improves upon the results of Grigorescu et al. [2011] that give a runtime of\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>(1+(2 \u03b7)<\/jats:sup>\n            <jats:sup>2<\/jats:sup>\n            +\n            <jats:italic>o<\/jats:italic>\n            (1))k\/2\n            <jats:italic>poly<\/jats:italic>\n            (1\/1-2\u03b7).\n          <\/jats:p>\n          <jats:p>\n            <jats:italic>\n              Learning\n              <jats:italic>k<\/jats:italic>\n              -Juntas with Noise.\n            <\/jats:italic>\n            Given uniformly random length\n            <jats:italic>n<\/jats:italic>\n            Boolean vectors, together with a label, which is some function of just\n            <jats:italic>k<\/jats:italic>\n            \u00ab\n            <jats:italic>n<\/jats:italic>\n            of the bits, perturbed by noise rate \u03b7, return the set of relevant indices. Leveraging the reduction of Feldman et al. [2009], our result for learning\n            <jats:italic>k<\/jats:italic>\n            -parities implies an algorithm for this problem with runtime\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>\u03c9+\u03f5\/3 k<\/jats:sup>\n            <jats:italic>poly<\/jats:italic>\n            (1\/1-2\u03b7) &lt;\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>0.8k<\/jats:sup>\n            <jats:italic>poly<\/jats:italic>\n            (1\/1-2 \u03b7), which is the first runtime for this problem of the form\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>ck<\/jats:sup>\n            with an absolute constant\n            <jats:italic>c<\/jats:italic>\n            &lt; 1.\n          <\/jats:p>\n          <jats:p>\n            <jats:italic>\n              Learning\n              <jats:italic>k<\/jats:italic>\n              -Juntas without Noise.\n            <\/jats:italic>\n            Given uniformly random length\n            <jats:italic>n<\/jats:italic>\n            Boolean vectors, together with a label, which is some function of\n            <jats:italic>k<\/jats:italic>\n            \u00ab\n            <jats:italic>n<\/jats:italic>\n            of the bits, return the set of relevant indices. Using a modification of the algorithm of Mossel et al. [2004], and employing our algorithm for learning sparse parities with noise via the reduction of Feldman et al. [2009], we obtain an algorithm for this problem with runtime\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>\u03c9+ \u03f5\/4 k<\/jats:sup>\n            <jats:italic>poly(n)<\/jats:italic>\n            &lt;\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>0.6k<\/jats:sup>\n            <jats:italic>poly(n)<\/jats:italic>\n            , which improves on the previous best of\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>\u03c9+1\/\u03c9k<\/jats:sup>\n            \u2248\n            <jats:italic>n<\/jats:italic>\n            <jats:sup>0.7k<\/jats:sup>\n            <jats:italic>poly(n)<\/jats:italic>\n            of Mossel et al. [2004].\n          <\/jats:p>","DOI":"10.1145\/2728167","type":"journal-article","created":{"date-parts":[[2015,5,11]],"date-time":"2015-05-11T16:30:57Z","timestamp":1431361857000},"page":"1-45","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":28,"title":["Finding Correlations in Subquadratic Time, with Applications to Learning Parities and the Closest Pair Problem"],"prefix":"10.1145","volume":"62","author":[{"given":"Gregory","family":"Valiant","sequence":"first","affiliation":[{"name":"Stanford University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2015,5,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/380752.380857"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007352.1007371"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2006.49"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327494"},{"volume-title":"Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP). 403--415","author":"Arora S.","key":"e_1_2_1_5_1","unstructured":"S. Arora and R. Ge . 2011. New algorithms for learning in presence of errors . In Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP). 403--415 . S. Arora and R. Ge. 2011. New algorithms for learning in presence of errors. In Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP). 403--415."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/361002.361007"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/195058.195147"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/792538.792543"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2011.12"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/509907.509965"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1137\/0217052"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcom.1997.0438"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/997817.997857"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2010.2050814"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1137\/070684914"},{"volume-title":"Proceedings of the 22nd International Conference on Algorithmic Learning Theory (ALT).","author":"Grigorescu E.","key":"e_1_2_1_16_1","unstructured":"E. Grigorescu , L. Reyzin , and S. Vempala . 2011. On noise-tolerant learning of sparse parities and related problems . In Proceedings of the 22nd International Conference on Algorithmic Learning Theory (ALT). E. Grigorescu, L. Reyzin, and S. Vempala. 2011. On noise-tolerant learning of sparse parities and related problems. In Proceedings of the 22nd International Conference on Algorithmic Learning Theory (ALT)."},{"volume-title":"Proceedings of the ASIACRYPT. 52--66","author":"Hopper N. J.","key":"e_1_2_1_17_1","unstructured":"N. J. Hopper and M. Blum . 2001. Secure human identification protocols . In Proceedings of the ASIACRYPT. 52--66 . N. J. Hopper and M. Blum. 2001. Secure human identification protocols. In Proceedings of the ASIACRYPT. 52--66."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/SFCS.1989.63486"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276876"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/293347.293351"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1137\/S0097539798347177"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/11538462_32"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1038\/ng1537"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1006\/inco.1993.1057"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2004.04.002"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1137856.1137881"},{"volume-title":"Proceedings of the Innovations in Theoretical Computer Science Conference (ITCS). 275--283","author":"O'Donnell R.","key":"e_1_2_1_27_1","unstructured":"R. O'Donnell , Y. Wu , and Y. Zhou . 2011. Optimal lower bounds for locality sensitive hashing (except when q is tiny) . In Proceedings of the Innovations in Theoretical Computer Science Conference (ITCS). 275--283 . R. O'Donnell, Y. Wu, and Y. Zhou. 2011. Optimal lower bounds for locality sensitive hashing (except when q is tiny). In Proceedings of the Innovations in Theoretical Computer Science Conference (ITCS). 275--283."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2090236.2090271"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/1109557.1109688"},{"volume-title":"Proceedings of the Conference on Learning Theory (COLT). 261--268","author":"Paturi Ramamohan","key":"e_1_2_1_30_1","unstructured":"Ramamohan Paturi , Sanguthevar Rajasekaran , and John H. Reif . 1989. The light bulb problem . In Proceedings of the Conference on Learning Theory (COLT). 261--268 . Ramamohan Paturi, Sanguthevar Rajasekaran, and John H. Reif. 1989. The light bulb problem. In Proceedings of the Conference on Learning Theory (COLT). 261--268."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1536414.1536461"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1568318.1568324"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCC.2010.26"},{"volume-title":"The Chebyshev Polynomials","author":"Rivlin T. J.","key":"e_1_2_1_34_1","unstructured":"T. J. Rivlin . 1974. The Chebyshev Polynomials . Wiley . T. J. Rivlin. 1974. The Chebyshev Polynomials. Wiley."},{"volume-title":"Foundations of Multidimensional and Metric Data Structures","author":"Samet H.","key":"e_1_2_1_35_1","unstructured":"H. Samet . 2006. Foundations of Multidimensional and Metric Data Structures . Elsevier . H. Samet. 2006. Foundations of Multidimensional and Metric Data Structures. Elsevier."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1215\/S0012-7094-42-00908-6"},{"key":"e_1_2_1_37_1","volume-title":"Orthogonal Polynomials","author":"Szeg\u00f6 G.","unstructured":"G. Szeg\u00f6 . 1975. Orthogonal Polynomials , 4 th Ed. American Mathematical Society , Colloquium Publications , 23. Providence, RI. G. Szeg\u00f6. 1975. Orthogonal Polynomials, 4th Ed. American Mathematical Society, Colloquium Publications, 23. Providence, RI.","edition":"4"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2012.27"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 1st Workshop on Computational Learning Theory. 28--39","author":"Valiant L.","year":"1988","unstructured":"L. Valiant . 1988 . Functionality in neural nets . In Proceedings of the 1st Workshop on Computational Learning Theory. 28--39 . L. Valiant. 1988. Functionality in neural nets. In Proceedings of the 1st Workshop on Computational Learning Theory. 28--39."},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the Conference on Learning Theory (COLT). 314--326","author":"Verbeurgt K. A.","year":"1990","unstructured":"K. A. Verbeurgt . 1990 . Learning DNF under the uniform distribution in quasipolynomial time . In Proceedings of the Conference on Learning Theory (COLT). 314--326 . K. A. Verbeurgt. 1990. Learning DNF under the uniform distribution in quasipolynomial time. In Proceedings of the Conference on Learning Theory (COLT). 314--326."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btq486"},{"volume-title":"Proceedings of the 24th International Conference on Very Large Databases (VLDB).","author":"Weber R.","key":"e_1_2_1_42_1","unstructured":"R. Weber , H. J. Schek , and S. Blott . 1998. A quantitative analysis and performance study for similarity--search methods in high--dimensional spaces . In Proceedings of the 24th International Conference on Very Large Databases (VLDB). R. Weber, H. J. Schek, and S. Blott. 1998. A quantitative analysis and performance study for similarity--search methods in high--dimensional spaces. In Proceedings of the 24th International Conference on Very Large Databases (VLDB)."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213977.2214056"}],"container-title":["Journal of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2728167","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2728167","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:16:32Z","timestamp":1750227392000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2728167"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,5,6]]},"references-count":43,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2015,5,6]]}},"alternative-id":["10.1145\/2728167"],"URL":"https:\/\/doi.org\/10.1145\/2728167","relation":{},"ISSN":["0004-5411","1557-735X"],"issn-type":[{"type":"print","value":"0004-5411"},{"type":"electronic","value":"1557-735X"}],"subject":[],"published":{"date-parts":[[2015,5,6]]},"assertion":[{"value":"2013-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-05-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}