{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,11]],"date-time":"2026-06-11T20:52:32Z","timestamp":1781211152027,"version":"3.54.1"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2020,8]]},"abstract":"<jats:p>\n            With the explosive growth of unstructured data (such as images, videos, and audios), unstructured data analytics is widespread in a rich vein of real-world applications. Many database systems start to incorporate unstructured data analysis to meet such demands. However, queries over unstructured and structured data are often treated as disjoint tasks in most systems, where\n            <jats:italic toggle=\"yes\">hybrid queries<\/jats:italic>\n            (\n            <jats:italic toggle=\"yes\">i.e.<\/jats:italic>\n            , involving both data types) are not yet fully supported.\n          <\/jats:p>\n          <jats:p>\n            In this paper, we present a hybrid analytic engine developed at Alibaba, named\n            <jats:italic toggle=\"yes\">AnalyticDB-V<\/jats:italic>\n            (ADBV), to fulfill such emerging demands. ADBV offers an interface that enables users to express\n            <jats:italic toggle=\"yes\">hybrid queries<\/jats:italic>\n            using SQL semantics by converting unstructured data to high dimensional vectors. ADBV adopts the\n            <jats:italic toggle=\"yes\">lambda<\/jats:italic>\n            framework and leverages the merits of approximate nearest neighbor search (ANNS) techniques to support hybrid data analytics. Moreover, a novel ANNS algorithm is proposed to improve the accuracy on large-scale vectors representing massive unstructured data. All ANNS algorithms are implemented as physical operators in ADBV, meanwhile, accuracy-aware cost-based optimization techniques are proposed to identify effective execution plans. Experimental results on both public and in-house datasets show the superior performance achieved by ADBV and its effectiveness. ADBV has been successfully deployed on Alibaba Cloud to provide hybrid query processing services for various real-world applications.\n          <\/jats:p>","DOI":"10.14778\/3415478.3415541","type":"journal-article","created":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T18:46:46Z","timestamp":1600109206000},"page":"3152-3165","source":"Crossref","is-referenced-by-count":117,"title":["AnalyticDB-V"],"prefix":"10.14778","volume":"13","author":[{"given":"Chuangxian","family":"Wei","sequence":"first","affiliation":[{"name":"Alibaba Group"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bin","family":"Wu","sequence":"additional","affiliation":[{"name":"Alibaba Group"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sheng","family":"Wang","sequence":"additional","affiliation":[{"name":"Alibaba Group"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Renjie","family":"Lou","sequence":"additional","affiliation":[{"name":"Alibaba Group"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chaoqun","family":"Zhan","sequence":"additional","affiliation":[{"name":"Alibaba Group"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Feifei","family":"Li","sequence":"additional","affiliation":[{"name":"Alibaba Group"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuanzhe","family":"Cai","sequence":"additional","affiliation":[{"name":"Alibaba Group"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2020,8]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Greenplum. https:\/\/greenplum.org\/."},{"key":"e_1_2_1_2_1","unstructured":"Pangu. https:\/\/www.alibabacloud.com\/blog\/pangu---the-high-performance-distributed-file-system-by-alibaba-cloud 594059."},{"key":"e_1_2_1_3_1","first-page":"1383","volume-title":"SIGMOD","author":"Armbrust M.","year":"2015","unstructured":"M. Armbrust, R. S. Xin, C. Lian, Y. Huai, D. Liu, J. K. Bradley, X. Meng, T. Kaftan, M. J. Franklin, A. Ghodsi, et al. Spark sql: Relational data processing in spark. In SIGMOD, pages 1383--1394. ACM, 2015."},{"key":"e_1_2_1_4_1","volume-title":"The inverted multi-index","author":"Babenko A.","year":"2014","unstructured":"A. Babenko and V. Lempitsky. The inverted multi-index. IEEE transactions on pattern analysis and machine intelligence, 37(6):1247--1260, 2014."},{"key":"e_1_2_1_5_1","first-page":"2055","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Babenko A.","year":"2016","unstructured":"A. Babenko and V. Lempitsky. Efficient indexing of billion-scale datasets of deep descriptors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2055--2063, 2016."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/361002.361007"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-68234-9_41"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00384"},{"key":"e_1_2_1_9_1","volume-title":"Jan.","year":"2019","unstructured":"Elasticsearch. Elasticsearch Approximate Nearest Neighbor plugin, Jan. 2019."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.14778\/3303753.3303754"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.379"},{"key":"e_1_2_1_12_1","first-page":"518","volume-title":"PVLDB","volume":"99","author":"Gionis A.","year":"1999","unstructured":"A. Gionis, P. Indyk, R. Motwani, et al. Similarity search in high dimensions via hashing. In PVLDB, volume 99, pages 518--529, 1999."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/MASSP.1984.1162229"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.6028\/NIST.IR.8271"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742795"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/602259.602266"},{"key":"e_1_2_1_17_1","series-title":"Series C (Applied Statistics), 28(1):100--108","volume-title":"Algorithm as 136: A k-means clustering algorithm. Journal of the Royal Statistical Society","author":"Hartigan J. A.","year":"1979","unstructured":"J. A. Hartigan and M. A. Wong. Algorithm as 136: A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1):100--108, 1979."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2015.03.016"},{"key":"e_1_2_1_19_1","volume-title":"A survey of top-k query processing techniques in relational database systems. ACM Computing Surveys (CSUR), 40(4):11","author":"Ilyas I. F.","year":"2008","unstructured":"I. F. Ilyas, G. Beskales, and M. A. Soliman. A survey of top-k query processing techniques in relational database systems. ACM Computing Surveys (CSUR), 40(4):11, 2008."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007593"},{"key":"e_1_2_1_21_1","first-page":"102","volume-title":"Symposium on Computational Geometry","author":"Indyk P.","year":"2002","unstructured":"P. Indyk. Approximate nearest neighbor algorithms for fr\u00e9chet distance via product metrics. In Symposium on Computational Geometry, pages 102--106. Citeseer, 2002."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276876"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2010.57"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2011.5946540"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.298"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1038\/35022643"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/2367502.2367518"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066173"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3284028.3284030"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2016.7553002"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2013.10.006"},{"key":"e_1_2_1_34_1","volume-title":"Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs","author":"Malkov Y. A.","year":"2018","unstructured":"Y. A. Malkov and D. A. Yashunin. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE transactions on pattern analysis and machine intelligence, 2018."},{"key":"e_1_2_1_35_1","volume-title":"Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs","author":"Malkov Y. A.","year":"2018","unstructured":"Y. A. Malkov and D. A. Yashunin. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE transactions on pattern analysis and machine intelligence, 2018."},{"key":"e_1_2_1_36_1","volume-title":"Alibaba's 'City Brain' is slashing congestion in its hometown. https:\/\/edition.cnn.com\/2019\/01\/15\/tech\/alibaba-city-brain-hangzhou\/index.html\/","author":"Michelle T.","year":"2019","unstructured":"T. Michelle and E. Leonie. Alibaba's 'City Brain' is slashing congestion in its hometown. https:\/\/edition.cnn.com\/2019\/01\/15\/tech\/alibaba-city-brain-hangzhou\/index.html\/, 2019. [Online; accessed 2-March-2020]."},{"key":"e_1_2_1_37_1","volume-title":"Towards practical visual search engine within elasticsearch. arXiv preprint arXiv:1806.08896","author":"Mu C.","year":"2018","unstructured":"C. Mu, J. Zhao, G. Yang, J. Zhang, and Z. Yan. Towards practical visual search engine within elasticsearch. arXiv preprint arXiv:1806.08896, 2018."},{"key":"e_1_2_1_38_1","volume-title":"International Computer Science Institute Berkeley","author":"Omohundro S. M.","year":"1989","unstructured":"S. M. Omohundro. Five balltree construction algorithms. International Computer Science Institute Berkeley, 1989."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/45.5.494"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/971697.602294"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_2_1_42_1","volume-title":"A unified deep neural network for speaker and language recognition. arXiv preprint arXiv:1504.00923","author":"Richardson F.","year":"2015","unstructured":"F. Richardson, D. Reynolds, and N. Dehak. A unified deep neural network for speaker and language recognition. arXiv preprint arXiv:1504.00923, 2015."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/582318.582321"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/988672.988723"},{"key":"e_1_2_1_45_1","volume-title":"Image retrieval: Current techniques, promising directions, and open issues. Journal of visual communication and image representation, 10:39--62","author":"Rui Y.","year":"1999","unstructured":"Y. Rui, T. S. Huang, and S.-F. Chang. Image retrieval: Current techniques, promising directions, and open issues. Journal of visual communication and image representation, 10:39--62, 1999."},{"key":"e_1_2_1_46_1","first-page":"2018","article-title":"An inside look at google bigquery.(2012)","volume":"29","author":"Sato K.","year":"2012","unstructured":"K. Sato. An inside look at google bigquery.(2012). Retrieved Jan, 29:2018, 2012.","journal-title":"Retrieved Jan"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/16856.16888"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15286-3_16"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/CICN.2011.15"},{"key":"e_1_2_1_51_1","first-page":"194","volume-title":"PVLDB","volume":"98","author":"Weber R.","year":"1998","unstructured":"R. Weber, H.-J. Schek, and S. Blott. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In PVLDB, volume 98, pages 194--205, 1998."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/1247480.1247484"},{"key":"e_1_2_1_53_1","volume-title":"et al. Analyticdb: Real-time olap database system at alibaba cloud. PVLDB, 12(12)","author":"Zhan C.","year":"2019","unstructured":"C. Zhan, M. Su, C. Wei, X. Peng, L. Lin, S. Wang, Z. Chen, F. Li, Y. Pan, F. Zheng, et al. Analyticdb: Real-time olap database system at alibaba cloud. PVLDB, 12(12), 2019."},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357938"},{"issue":"13","key":"e_1_2_1_55_1","first-page":"1393","article-title":"a fault-tolerant resource management and job scheduling system at internet scale","volume":"7","author":"Zhang Z.","year":"2014","unstructured":"Z. Zhang, C. Li, Y. Tao, R. Yang, H. Tang, and J. Xu. Fuxi: a fault-tolerant resource management and job scheduling system at internet scale. PVLDB, 7(13):1393--1404, 2014.","journal-title":"PVLDB"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.5555\/2919332.2919877"},{"key":"e_1_2_1_57_1","volume-title":"Oct.","author":"ZILLIZ.","year":"2019","unstructured":"ZILLIZ. Milvus: an open source vector similarity search engine, Oct. 2019."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3415478.3415541","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,17]],"date-time":"2025-09-17T02:25:45Z","timestamp":1758075945000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3415478.3415541"}},"subtitle":["a hybrid analytical engine towards query fusion for structured and unstructured data"],"short-title":[],"issued":{"date-parts":[[2020,8]]},"references-count":57,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2020,8]]}},"alternative-id":["10.14778\/3415478.3415541"],"URL":"https:\/\/doi.org\/10.14778\/3415478.3415541","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2020,8]]}}}