{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:26:10Z","timestamp":1750220770079,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":28,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,2,19]],"date-time":"2020-02-19T00:00:00Z","timestamp":1582070400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,2,19]]},"DOI":"10.1145\/3385209.3385224","type":"proceedings-article","created":{"date-parts":[[2020,6,7]],"date-time":"2020-06-07T00:35:48Z","timestamp":1591490148000},"page":"51-55","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["An Affinity Propagation Approach for Entity Clustering with Spark"],"prefix":"10.1145","author":[{"given":"Phuc Quang","family":"Tran","sequence":"first","affiliation":[{"name":"Department of Foreign Languages and Informatics, People's Police College III, Cantho, Vietnam"}]},{"given":"Ngoan Thanh","family":"Trieu","sequence":"additional","affiliation":[{"name":"College of Information and Communication Technology, Can Tho University, Cantho, Vietnam"}]},{"given":"Huong Hoang","family":"Luong","sequence":"additional","affiliation":[{"name":"Department of Information Technology, FPT University, Cantho, Vietnam"}]},{"given":"Nghi C.","family":"Tran","sequence":"additional","affiliation":[{"name":"College of Electrical Engineering and Computer Science, National Central University, Taoyuan, Taiwan"}]},{"given":"Hiep Xuan","family":"Huynh","sequence":"additional","affiliation":[{"name":"College of Information and Communication Technology, Can Tho University, Cantho, Vietnam"}]}],"member":"320","published-online":{"date-parts":[[2020,6,6]]},"reference":[{"volume-title":"Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications)","author":"Liu Bing","key":"e_1_3_2_1_1_1","unstructured":"Le, Bing Liu. 2006. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications). Springer-Verlag, Berlin, Heidelberg."},{"key":"e_1_3_2_1_2_1","volume-title":"Clustering by passing messages between data points. Science, 315 5814, 972--6","author":"Dueck B.J.","year":"2007","unstructured":"Frey, B.J., & Dueck, D. (2007). Clustering by passing messages between data points. Science, 315 5814, 972--6."},{"key":"e_1_3_2_1_3_1","unstructured":"Dueck Delbert. (2009). Affinity propagation: Clustering data by passing messages. PhD thesis."},{"key":"e_1_3_2_1_4_1","volume-title":"A Survey of Named Entity Recognition and Classification. Lingvisticae Investigationes. 30. 10.1075\/li.30.1.03nad","author":"Sekine David","year":"2007","unstructured":"Nadeau, David & Sekine, Satoshi. (2007). A Survey of Named Entity Recognition and Classification. Lingvisticae Investigationes. 30. 10.1075\/li.30.1.03nad"},{"key":"e_1_3_2_1_5_1","volume-title":"A Survey on Deep Learning for Named Entity Recognition","author":"Sun Jing","year":"2018","unstructured":"li, Jing & Sun, Aixin & Han, Ray & Li, Chenliang. (2018). A Survey on Deep Learning for Named Entity Recognition."},{"key":"e_1_3_2_1_6_1","volume-title":"A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques","author":"Pouriyeh Mehdi","year":"2017","unstructured":"Allahyari, Mehdi & Pouriyeh, Seyedamin & Assefi, Mehdi & Safaei, Saied & Trippe, Elizabeth & Gutierrez, Juan & Kochut, Krys. (2017). A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques."},{"key":"e_1_3_2_1_7_1","first-page":"10","volume":"23","author":"Tang Tru","year":"2012","unstructured":"Cao, Tru & Tang, Thao & Chau, Cuong. (2012). Text Clustering with Named Entities: A Model, Experimentation and Realization. 23. 10.1007\/978-3-642-23166-710.","journal-title":"Text Clustering with Named Entities: A Model, Experimentation and Realization."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8655(99)00069-0"},{"key":"e_1_3_2_1_9_1","first-page":"12","volume-title":"echnical Report] IAS-UVA-01-02","author":"Likas Aristidis","year":"2001","unstructured":"Aristidis Likas, Nikos Vlassis, Jakob Verbeek. The global k-means clustering algorithm. [Technical Report] IAS-UVA-01-02, 2001, pp.12."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.3115\/974557.974586"},{"key":"e_1_3_2_1_11_1","volume-title":"Clustering Algorithms","author":"Hartigan John A.","unstructured":"John A. Hartigan. 1975. Clustering Algorithms (99th ed.). John Wiley & Sons, Inc., New York, NY, USA.","edition":"99"},{"key":"e_1_3_2_1_12_1","volume-title":"Dubes","author":"Jain Anil K.","year":"1988","unstructured":"Anil K. Jain and Richard C. Dubes. 1988. Algorithms for Clustering Data. Prentice-Hall, Inc., Upper Saddle River, NJ, USA."},{"key":"e_1_3_2_1_13_1","volume-title":"A Neural Probabilistic Language Model. JMLR, 3: 1137--1155","author":"Bengio Y.","year":"2003","unstructured":"Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin. A Neural Probabilistic Language Model. JMLR, 3: 1137--1155, 2003."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390177"},{"key":"e_1_3_2_1_15_1","volume-title":"ACL","author":"Turian J.","year":"2010","unstructured":"J. Turian, L. Ratinov, and Y. Bengio. Word representations: a simple and general method for semi-supervised learning. In ACL, 2010."},{"key":"e_1_3_2_1_16_1","volume-title":"Efficient Estimation of Word Representations in Vector Space. CoRR, abs\/1301.3781","author":"Dean G.S.","year":"2013","unstructured":"Mikolov, T., Chen, K., Corrado, G.S., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. CoRR, abs\/1301.3781."},{"key":"e_1_3_2_1_17_1","volume-title":"Proceedings of the 32nd International Conference on International Conference on Machine Learning -","volume":"37","author":"Kusner Matt J.","unstructured":"Matt J. Kusner, Yu Sun, Nicholas I. Kolkin, and Kilian Q. Weinberger. 2015. From word embeddings to document distances. In Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37 (ICML '15), Francis Bach and David Blei (Eds.), Vol. 37. JMLR.org 957--966"},{"key":"e_1_3_2_1_18_1","volume-title":"Proceedings of the 26th International Conference on Neural Information Processing Systems -","volume":"2","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2 (NIPS'13), C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger (Eds.), Vol. 2. Curran Associates Inc., USA, 3111--3119."},{"key":"e_1_3_2_1_19_1","volume-title":"Masters thesis","author":"Mikolov T.","year":"2007","unstructured":"T. Mikolov. Language Modeling for Speech Recognition in Czech, Masters thesis, Brno University of Technology, 2007."},{"key":"e_1_3_2_1_20_1","volume-title":"Proc. ICASSP","author":"Mikolov T.","year":"2009","unstructured":"T. Mikolov, J. Kopecky, L. Burget, O. Glembek and J. Cernocky. Neural network based language models for higly inflective languages, In: Proc. ICASSP 2009."},{"key":"e_1_3_2_1_21_1","volume-title":"International Journal of Machine Learning and Cybernetics. 1. 43--52. 10.1007\/s13042-010-0001-0","author":"Jin Yin","year":"2010","unstructured":"Zhang, Yin & Jin, Rong & Zhou, Zhi-Hua. (2010). Understanding bagof-words model: A statistical framework. International Journal of Machine Learning and Cybernetics. 1. 43--52. 10.1007\/s13042-010-0001-0"},{"key":"e_1_3_2_1_22_1","series-title":"Lecture Notes in Computer Science","volume-title":"Sampling Strategies for Bag-of-Features Image Classification. Computer Vision - ECCV","author":"Jurie Eric","year":"2006","unstructured":"Nowak, Eric & Jurie, Frederic & Triggs, Bill. (2006). Sampling Strategies for Bag-of-Features Image Classification. Computer Vision - ECCV 2006, Volume 3954 of Lecture Notes in Computer Science. 3954. 490--503. 10.1007\/11744085_38."},{"volume-title":"Using TF-IDF to determine word relevance indocument queries","year":"2003","key":"e_1_3_2_1_23_1","unstructured":"Ramos, Juan. (2003). Using TF-IDF to determine word relevance indocument queries."},{"key":"e_1_3_2_1_24_1","unstructured":"\"Apache Spark - Lightning-Fast Cluster Computing.\" [Online]. Available: http:\/\/spark.apache.org\/. [Accessed: 12-Jun- 2019]"},{"key":"e_1_3_2_1_25_1","first-page":"10","volume-title":"Proceedings of the 2Nd USENIX Conference on Hot Topics in Cloud Computing","author":"Zaharia M.","year":"2010","unstructured":"M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster Computing with Working Sets. In Proceedings of the 2Nd USENIX Conference on Hot Topics in Cloud Computing, Berkeley, CA, USA, 2010, pp. 10--10."},{"key":"e_1_3_2_1_26_1","unstructured":"\"Apache Hadoop\" [Online]. Available: http:\/\/hadoop.apache.org\/. [Accessed: 12-Jul- 2019]."},{"key":"e_1_3_2_1_27_1","volume-title":"WWW","author":"McAuley J.","year":"2013","unstructured":"J. McAuley and J. Leskovec. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. WWW, 2013."},{"key":"e_1_3_2_1_28_1","unstructured":"\"Paper Reviews Data Set \" [Online]. Available: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Paper+Reviews. [Accessed: 12-Sep- 2019]."}],"event":{"name":"ICIIT 2020: 2020 5th International Conference on Intelligent Information Technology","acronym":"ICIIT 2020","location":"Hanoi Viet Nam"},"container-title":["Proceedings of the 2020 5th International Conference on Intelligent Information Technology"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3385209.3385224","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3385209.3385224","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:41:14Z","timestamp":1750200074000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3385209.3385224"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,19]]},"references-count":28,"alternative-id":["10.1145\/3385209.3385224","10.1145\/3385209"],"URL":"https:\/\/doi.org\/10.1145\/3385209.3385224","relation":{},"subject":[],"published":{"date-parts":[[2020,2,19]]},"assertion":[{"value":"2020-06-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}