{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,26]],"date-time":"2025-10-26T15:05:07Z","timestamp":1761491107054,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":36,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T00:00:00Z","timestamp":1603065600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,19]]},"DOI":"10.1145\/3340531.3412029","type":"proceedings-article","created":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T06:32:45Z","timestamp":1603089165000},"page":"1535-1544","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Mining Infrequent High-Quality Phrases from Domain-Specific Corpora"],"prefix":"10.1145","author":[{"given":"Li","family":"Wang","sequence":"first","affiliation":[{"name":"Fudan University, Shanghai, China"}]},{"given":"Wei","family":"Zhu","sequence":"additional","affiliation":[{"name":"Pingan Health Technology, Shanghai, China"}]},{"given":"Sihang","family":"Jiang","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}]},{"given":"Sheng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Pingan Health Technology, Shanghai, China"}]},{"given":"Keqiang","family":"Wang","sequence":"additional","affiliation":[{"name":"Pingan Health Technology, Shanghai, China"}]},{"given":"Yuan","family":"Ni","sequence":"additional","affiliation":[{"name":"Pingan Health Technology, Shanghai, China"}]},{"given":"Guotong","family":"Xie","sequence":"additional","affiliation":[{"name":"Pingan Health Technology, Shanghai, 
China"}]},{"given":"Yanghua","family":"Xiao","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}]}],"member":"320","published-online":{"date-parts":[[2020,10,19]]},"reference":[{"key":"e_1_3_2_2_1_1","unstructured":"Armen Allahverdyan and Aram Galstyan. 2011. Comparative analysis of viterbi training and maximum likelihood estimation for hmms. In Advances in Neural Information Processing Systems. 1674--1682."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1921007"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.3115\/981732.981764"},{"key":"e_1_3_2_2_4_1","volume-title":"Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio.","author":"Cho Kyunghyun","year":"2014","unstructured":"Kyunghyun Cho, Bart Van Merri\u00ebnboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014)."},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611973440.46"},{"key":"e_1_3_2_2_6_1","unstructured":"P Deepak Atreyee Dey and Debapriyo Majumdar. 2014. Fast Mining of Interesting Phrases from Subsets of Text Corpora. In EDBT. 
193--204."},{"key":"e_1_3_2_2_7_1","volume-title":"Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/2735508.2735519"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-349-27748-3"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/s007999900023"},{"key":"e_1_3_2_2_11_1","volume-title":"Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological cybernetics","author":"Fukushima Kunihiko","year":"1980","unstructured":"Kunihiko Fukushima. 1980. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological cybernetics, Vol. 36, 4 (1980), 193--202."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2247596.2247628"},{"key":"e_1_3_2_2_13_1","volume-title":"Extremely randomized trees. Machine learning","author":"Geurts Pierre","year":"2006","unstructured":"Pierre Geurts, Damien Ernst, and Louis Wehenkel. 2006. Extremely randomized trees. 
Machine learning, Vol. 63, 1 (2006), 3--42."},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2005.1556215"},{"key":"e_1_3_2_2_15_1","volume-title":"Proceedings of the 23rd International Conference on Computational Linguistics: Posters. Association for Computational Linguistics, 365--373","author":"Hasan Kazi Saidul","year":"2010","unstructured":"Kazi Saidul Hasan and Vincent Ng. 2010. Conundrums in unsupervised keyphrase extraction: making sense of the state-of-the-art. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters. Association for Computational Linguistics, 365--373."},{"key":"e_1_3_2_2_16_1","volume-title":"Long short-term memory. Neural computation","author":"Hochreiter Sepp","year":"1997","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural computation, Vol. 9, 8 (1997), 1735--1780."},{"key":"e_1_3_2_2_17_1","volume-title":"Adam: A Method for Stochastic Optimization. International Conference on Learning Representations (12","author":"Kingma Diederik","year":"2014","unstructured":"Diederik Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. International Conference on Learning Representations (12 2014)."},{"key":"e_1_3_2_2_18_1","unstructured":"John Lafferty Andrew McCallum and Fernando CN Pereira. 2001. 
Conditional random fields: Probabilistic models for segmenting and labeling sequence data. (2001)."},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.5555\/3298023.3298073"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2018.2823758"},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2018.00042"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2751523"},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33019678"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1054"},{"key":"e_1_3_2_2_25_1","volume-title":"Proceedings of the 2004 conference on empirical methods in natural language processing.","author":"Mihalcea Rada","year":"2004","unstructured":"Rada Mihalcea and Paul Tarau. 2004. Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing."},{"key":"e_1_3_2_2_26_1","unstructured":"Tomas Mikolov Ilya Sutskever Kai Chen Greg S Corrado and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems. 
3111--3119."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2018.06.004"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920914"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.3115\/1072228.1072370"},{"key":"e_1_3_2_2_30_1","unstructured":"Vasin Punyakanok and Dan Roth. 2001. The use of classifiers in sequential inference. In Advances in Neural Information Processing Systems. 995--1001."},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2017.05.014"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052708"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2018.2812203"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"crossref","unstructured":"Jingbo Shang Liyuan Liu Xiang Ren Xiaotao Gu Teng Ren and Jiawei Han. 2018b. Learning Named Entity Tagger using Domain-Specific Dictionary. In EMNLP.","DOI":"10.18653\/v1\/D18-1230"},{"key":"e_1_3_2_2_35_1","volume-title":"Kea: Practical automated keyphrase extraction. In Design and Usability of Digital Libraries: Case Studies in the Asia Pacific. IGI Global, 129--152.","author":"Witten Ian H","year":"2005","unstructured":"Ian H Witten, Gordon W Paynter, Eibe Frank, Carl Gutwin, and Craig G Nevill-Manning. 2005. Kea: Practical automated keyphrase extraction. 
In Design and Usability of Digital Libraries: Case Studies in the Asia Pacific. IGI Global, 129--152."},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075218.1075233"}],"event":{"name":"CIKM '20: The 29th ACM International Conference on Information and Knowledge Management","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Virtual Event Ireland","acronym":"CIKM '20"},"container-title":["Proceedings of the 29th ACM International Conference on Information & Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3340531.3412029","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3340531.3412029","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:29Z","timestamp":1750197749000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3340531.3412029"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,19]]},"references-count":36,"alternative-id":["10.1145\/3340531.3412029","10.1145\/3340531"],"URL":"https:\/\/doi.org\/10.1145\/3340531.3412029","relation":{},"subject":[],"published":{"date-parts":[[2020,10,19]]},"assertion":[{"value":"2020-10-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}