{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T02:52:05Z","timestamp":1776394325785,"version":"3.51.2"},"reference-count":71,"publisher":"Association for Computing Machinery (ACM)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2018,6]]},"abstract":"<jats:p>\n            In exploratory data analysis, analysts often have a need to identify histograms that possess a specific distribution, among a large class of candidate histograms, e.g., find countries whose income distribution is most similar to that of Greece. This distribution could be a new one that the user is curious about, or a known distribution from an existing histogram visualization. At present, this process of identification is brute-force, requiring the manual generation and evaluation of a large number of histograms. We present FastMatch: an end-to-end approach for interactively retrieving the histogram visualizations most similar to a user-specified target, from a large collection of histograms. The primary technical contribution underlying FastMatch is a probabilistic algorithm, HistSim, a theoretically sound sampling-based approach to identify the top-\n            <jats:italic>k<\/jats:italic>\n            closest histograms under\n            <jats:italic>\u2113<\/jats:italic>\n            <jats:sub>1<\/jats:sub>\n            distance. While HistSim can be used independently, within FastMatch we couple HistSim with a novel system architecture that is aware of practical considerations, employing asynchronous block-based sampling policies. FastMatch obtains near-perfect accuracy with up to 35\u00d7 speedup over approaches that do not use sampling on several real-world datasets.\n          <\/jats:p>","DOI":"10.14778\/3231751.3231753","type":"journal-article","created":{"date-parts":[[2018,7,27]],"date-time":"2018-07-27T12:21:07Z","timestamp":1532694067000},"page":"1262-1275","source":"Crossref","is-referenced-by-count":19,"title":["Adaptive sampling for rapidly matching histograms"],"prefix":"10.14778","volume":"11","author":[{"given":"Stephen","family":"Macke","sequence":"first","affiliation":[{"name":"University of Illinois-Urbana Champaign"}]},{"given":"Yiming","family":"Zhang","sequence":"additional","affiliation":[{"name":"University of Illinois-Urbana Champaign"}]},{"given":"Silu","family":"Huang","sequence":"additional","affiliation":[{"name":"University of Illinois-Urbana Champaign"}]},{"given":"Aditya","family":"Parameswaran","sequence":"additional","affiliation":[{"name":"University of Illinois-Urbana Champaign"}]}],"member":"320","published-online":{"date-parts":[[2018,6]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"https:\/\/www.boost.org\/doc\/libs\/1_67_0\/libs\/math\/doc\/html\/dist.html","author":"Distributions Boost Statistical","year":"2006","unstructured":"Boost Statistical Distributions and Functions. https:\/\/www.boost.org\/doc\/libs\/1_67_0\/libs\/math\/doc\/html\/dist.html , 2006 . Boost Statistical Distributions and Functions. https:\/\/www.boost.org\/doc\/libs\/1_67_0\/libs\/math\/doc\/html\/dist.html, 2006."},{"key":"e_1_2_1_2_1","volume-title":"http:\/\/stat-computing.org\/dataexpo\/2009\/the-data.html","author":"Records Flight","year":"2009","unstructured":"Flight Records . http:\/\/stat-computing.org\/dataexpo\/2009\/the-data.html , 2009 . Flight Records. http:\/\/stat-computing.org\/dataexpo\/2009\/the-data.html, 2009."},{"key":"e_1_2_1_3_1","volume-title":"https:\/\/github.com\/toddwschneider\/nyc-taxi-data\/","author":"Taxi Trip Records NYC","year":"2015","unstructured":"NYC Taxi Trip Records . https:\/\/github.com\/toddwschneider\/nyc-taxi-data\/ , 2015 . NYC Taxi Trip Records. https:\/\/github.com\/toddwschneider\/nyc-taxi-data\/, 2015."},{"key":"e_1_2_1_4_1","volume-title":"https:\/\/stacks.stanford.edu\/file\/druid: py883nd2578\/WA-clean.csv.gz","author":"Police Stop Records WA","year":"2017","unstructured":"WA Police Stop Records . https:\/\/stacks.stanford.edu\/file\/druid: py883nd2578\/WA-clean.csv.gz , 2017 . WA Police Stop Records. https:\/\/stacks.stanford.edu\/file\/druid: py883nd2578\/WA-clean.csv.gz, 2017."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/335191.335450"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/304181.304581"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2465351.2465355"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939502.2939512"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/237814.237823"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872822"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.3150\/14-BEJ605"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/795666.796548"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1037\/1082-989X.2.2.131"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453895"},{"key":"e_1_2_1_15_1","volume-title":"Statistical inference","author":"Casella G.","year":"2002","unstructured":"G. Casella and R. L. Berger . Statistical inference , volume 2 . Duxbury Pacific Grove , CA , 2002 . G. Casella and R. L. Berger. Statistical inference, volume 2. Duxbury Pacific Grove, CA, 2002."},{"key":"e_1_2_1_16_1","volume-title":"Approximate query processing using wavelets. PVLDB, 10(2--3):199--223","author":"Chakrabarti K.","year":"2001","unstructured":"K. Chakrabarti , M. Garofalakis , R. Rastogi , and K. Shim . Approximate query processing using wavelets. PVLDB, 10(2--3):199--223 , 2001 . K. Chakrabarti, M. Garofalakis, R. Rastogi, and K. Shim. Approximate query processing using wavelets. PVLDB, 10(2--3):199--223, 2001."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/276305.276336"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/2634074.2634162"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.5555\/645484.656545"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/276305.276343"},{"key":"e_1_2_1_21_1","first-page":"617","volume-title":"Data Engineering, 2002. Proceedings. 18th International Conference on","author":"Chen C.-M.","year":"2002","unstructured":"C.-M. Chen and Y. Ling . A sampling-based estimator for top-k selection query . In Data Engineering, 2002. Proceedings. 18th International Conference on , pages 617 -- 627 . IEEE, 2002 . C.-M. Chen and Y. Ling. A sampling-based estimator for top-k selection query. In Data Engineering, 2002. Proceedings. 18th International Conference on, pages 617--627. IEEE, 2002."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3035921"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2008.04.021"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2013.31"},{"key":"e_1_2_1_25_1","unstructured":"I. Diakonikolas. Personal communication 2017.  I. Diakonikolas. Personal communication 2017."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915249"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2207676.2208294"},{"key":"e_1_2_1_28_1","first-page":"176","volume-title":"PVLDB","author":"Ganti V.","year":"2000","unstructured":"V. Ganti , M.-L. Lee , and R. Ramakrishnan . Icicles: Self-tuning samples for approximate query answering . PVLDB , pages 176 -- 187 , 2000 . V. Ganti, M.-L. Lee, and R. Ramakrishnan. Icicles: Self-tuning samples for approximate query answering. PVLDB, pages 176--187, 2000."},{"key":"e_1_2_1_29_1","volume-title":"International statistical review, 70(3):419--435","author":"Gibbs A. L.","year":"2002","unstructured":"A. L. Gibbs and F. E. Su . On choosing and bounding probability metrics . International statistical review, 70(3):419--435 , 2002 . A. L. Gibbs and F. E. Su. On choosing and bounding probability metrics. International statistical review, 70(3):419--435, 2002."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213902"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/253262.253291"},{"key":"e_1_2_1_32_1","volume-title":"Probability inequalities for sums of bounded random variables. Journal of the American statistical association, 58(301):13--30","author":"Hoeffding W.","year":"1963","unstructured":"W. Hoeffding . Probability inequalities for sums of bounded random variables. Journal of the American statistical association, 58(301):13--30 , 1963 . W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American statistical association, 58(301):13--30, 1963."},{"key":"e_1_2_1_33_1","first-page":"65","volume-title":"Scandinavian journal of statistics","author":"Holm S.","year":"1979","unstructured":"S. Holm . A simple sequentially rejective multiple test procedure. Scandinavian journal of statistics , pages 65 -- 70 , 1979 . S. Holm. A simple sequentially rejective multiple test procedure. Scandinavian journal of statistics, pages 65--70, 1979."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/66926.66933"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1391729.1391730"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/568271.223841"},{"key":"e_1_2_1_37_1","first-page":"174","volume-title":"PVLDB","author":"Ioannidis Y. E.","year":"1999","unstructured":"Y. E. Ioannidis and V. Poosala . Histogram-based approximation of set-valued query-answers . PVLDB , pages 174 -- 185 , 1999 . Y. E. Ioannidis and V. Poosala. Histogram-based approximation of set-valued query-answers. PVLDB, pages 174--185, 1999."},{"key":"e_1_2_1_38_1","first-page":"24","volume-title":"Optimal histograms with quality guarantees","author":"Jagadish H. V.","year":"1998","unstructured":"H. V. Jagadish , N. Koudas , S. Muthukrishnan , V. Poosala , K. C. Sevcik , and T. Suel . Optimal histograms with quality guarantees . pages 24 -- 27 , 1998 . H. V. Jagadish, N. Koudas, S. Muthukrishnan, V. Poosala, K. C. Sevcik, and T. Suel. Optimal histograms with quality guarantees. pages 24--27, 1998."},{"key":"e_1_2_1_39_1","doi-asserted-by":"crossref","DOI":"10.1002\/0471715816","volume-title":"Univariate discrete distributions","author":"Johnson N. L.","year":"2005","unstructured":"N. L. Johnson , A. W. Kemp , and S. Kotz . Univariate discrete distributions , volume 444 . John Wiley & Sons , 2005 . N. L. Johnson, A. W. Kemp, and S. Kotz. Univariate discrete distributions, volume 444. John Wiley & Sons, 2005."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732951.2732953"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2254556.2254659"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/2735479.2735485"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3209900.3209903"},{"key":"e_1_2_1_44_1","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1007\/978-3-540-71703-4_30","volume-title":"Probabilistic nearest-neighbor query on uncertain objects. Advances in databases: concepts, systems and applications","author":"Kriegel H.-P.","year":"2007","unstructured":"H.-P. Kriegel , P. Kunath , and M. Renz . Probabilistic nearest-neighbor query on uncertain objects. Advances in databases: concepts, systems and applications , pages 337 -- 348 , 2007 . H.-P. Kriegel, P. Kunath, and M. Renz. Probabilistic nearest-neighbor query on uncertain objects. Advances in databases: concepts, systems and applications, pages 337--348, 2007."},{"key":"e_1_2_1_45_1","volume-title":"Testing statistical hypotheses","author":"Lehmann E. L.","year":"2006","unstructured":"E. L. Lehmann and J. P. Romano . Testing statistical hypotheses . Springer Science & Business Media , 2006 . E. L. Lehmann and J. P. Romano. Testing statistical hypotheses. Springer Science & Business Media, 2006."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2013.179"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/93605.93611"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(93)90224-H"},{"issue":"12","key":"e_1_2_1_49_1","first-page":"2122","article-title":"The effects of interactive latency on exploratory visual analysis","volume":"20","author":"Liu Z.","year":"2014","unstructured":"Z. Liu and J. Heer . The effects of interactive latency on exploratory visual analysis . IEEE TVCG , 20 ( 12 ): 2122 -- 2131 , 2014 . Z. Liu and J. Heer. The effects of interactive latency on exploratory visual analysis. IEEE TVCG, 20(12):2122--2131, 2014.","journal-title":"IEEE TVCG"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12129"},{"key":"e_1_2_1_51_1","volume-title":"Available at: http:\/\/data-people.cs.illinois.edu\/papers\/fastmatch.pdf","author":"Macke S.","year":"2017","unstructured":"S. Macke , Y. Zhang , S. Huang , and A. Parameswaran . Fastmatch: Adaptive algorithms for rapid discovery of relevant histogram visualizations. Technical report , Available at: http:\/\/data-people.cs.illinois.edu\/papers\/fastmatch.pdf , 2017 . S. Macke, Y. Zhang, S. Huang, and A. Parameswaran. Fastmatch: Adaptive algorithms for rapid discovery of relevant histogram visualizations. Technical report, Available at: http:\/\/data-people.cs.illinois.edu\/papers\/fastmatch.pdf, 2017."},{"issue":"1","key":"e_1_2_1_52_1","first-page":"148","article-title":"On the method of bounded differences","volume":"141","author":"McDiarmid C.","year":"1989","unstructured":"C. McDiarmid . On the method of bounded differences . Surveys in combinatorics , 141 ( 1 ): 148 -- 188 , 1989 . C. McDiarmid. On the method of bounded differences. Surveys in combinatorics, 141(1):148--188, 1989.","journal-title":"Surveys in combinatorics"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3025453.3025456"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056098"},{"key":"e_1_2_1_55_1","volume-title":"Pthreads programming: A POSIX standard for better multiprocessing. \" O'Reilly Media","author":"Nichols B.","year":"1996","unstructured":"B. Nichols , D. Buttlar , and J. Farrell . Pthreads programming: A POSIX standard for better multiprocessing. \" O'Reilly Media , Inc .\", 1996 . B. Nichols, D. Buttlar, and J. Farrell. Pthreads programming: A POSIX standard for better multiprocessing. \" O'Reilly Media, Inc.\", 1996."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2016.7498287"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-010-0185-7"},{"key":"e_1_2_1_58_1","first-page":"2","volume-title":"Proceedings of international conference on intelligence analysis","volume":"5","author":"Pirolli P.","year":"2005","unstructured":"P. Pirolli and S. Card . The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis . In Proceedings of international conference on intelligence analysis , volume 5 , pages 2 -- 4 , 2005 . P. Pirolli and S. Card. The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis. In Proceedings of international conference on intelligence analysis, volume 5, pages 2--4, 2005."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10619-013-7132-8"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137628.3137637"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2007.367934"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.14778\/3025111.3025126"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.10"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2007.367935"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.14778\/2831360.2831371"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/2688073.2688095"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.2172\/808915"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-69497-7_23"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807238"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824072"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2735381"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3231751.3231753","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T10:39:44Z","timestamp":1672223984000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3231751.3231753"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,6]]},"references-count":71,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2018,6]]}},"alternative-id":["10.14778\/3231751.3231753"],"URL":"https:\/\/doi.org\/10.14778\/3231751.3231753","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2018,6]]}}}