{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T19:06:35Z","timestamp":1780081595060,"version":"3.54.0"},"reference-count":39,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2018,2,2]],"date-time":"2018-02-02T00:00:00Z","timestamp":1517529600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004543","name":"China Scholarship Council","doi-asserted-by":"publisher","award":["JIE HU"],"award-info":[{"award-number":["JIE HU"]}],"id":[{"id":"10.13039\/501100004543","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61640209"],"award-info":[{"award-number":["61640209"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51475097"],"award-info":[{"award-number":["51475097"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["51741101"],"award-info":[{"award-number":["51741101"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004001","name":"Science and Technology Foundation of Guizhou Province","doi-asserted-by":"publisher","award":["R[2015]13, JZ[2014]2004 and LH[2016]7433"],"award-info":[{"award-number":["R[2015]13, JZ[2014]2004 and LH[2016]7433"]}],"id":[{"id":"10.13039\/501100004001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Many text mining tasks such as text 
retrieval, text summarization, and text comparison depend on the extraction of representative keywords from the main text. Most existing keyword extraction algorithms are based on a discrete bag-of-words representation of the text. In this paper, we propose a patent keyword extraction algorithm (PKEA) based on the distributed Skip-gram model for patent classification. We also develop a set of quantitative performance measures for evaluating keyword extraction, based on information gain and on cross-validated Support Vector Machine (SVM) classification, which are valuable when human-annotated keywords are not available. We used a standard benchmark dataset and a homemade patent dataset to evaluate the performance of PKEA. Our patent dataset includes 2500 patents from five distinct technological fields related to autonomous cars (GPS systems, lidar systems, object recognition systems, radar systems, and vehicle control systems). We compared our method with Frequency, Term Frequency-Inverse Document Frequency (TF-IDF), TextRank, and Rapid Automatic Keyword Extraction (RAKE). 
The experimental results show that our proposed algorithm provides a promising way to extract keywords from patent texts for patent classification.<\/jats:p>","DOI":"10.3390\/e20020104","type":"journal-article","created":{"date-parts":[[2018,2,2]],"date-time":"2018-02-02T06:45:40Z","timestamp":1517553940000},"page":"104","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":88,"title":["Patent Keyword Extraction Algorithm Based on Distributed Representation for Patent Classification"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3893-1594","authenticated-orcid":false,"given":"Jie","family":"Hu","sequence":"first","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology of Ministry of Education, Guizhou University, Guiyang 550025, China"},{"name":"Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4759-6000","authenticated-orcid":false,"given":"Shaobo","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology of Ministry of Education, Guizhou University, Guiyang 550025, China"},{"name":"School of Mechanical Engineering, Guizhou University, Guiyang 550025, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yong","family":"Yao","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, Guizhou University, Guiyang 550025, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Liya","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, Guizhou University, Guiyang 550025, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8761-5195","authenticated-orcid":false,"given":"Guanci","family":"Yang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Advanced Manufacturing Technology of Ministry of Education, Guizhou University, Guiyang 550025, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8725-6660","authenticated-orcid":false,"given":"Jianjun","family":"Hu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA"},{"name":"School of Mechanical Engineering, Guizhou University, Guiyang 550025, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2018,2,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Gerken, J.M., and Moehrle, M.G. (2012). A New Instrument for Technology Monitoring: Novelty in Patents Measured by Semantic Patent Analysis, Springer-Verlag, Inc.","DOI":"10.1007\/s11192-012-0635-7"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.techfore.2017.02.018","article-title":"Application technology opportunity discovery from technology portfolios: Use of patent classification and collaborative filtering","volume":"118","author":"Park","year":"2017","journal-title":"Technol. Forecast. Soc. Chang."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1016\/j.techfore.2016.08.020","article-title":"Monitoring emerging technologies for technology planning using technical keyword based analysis from patent data","volume":"114","author":"Joung","year":"2017","journal-title":"Technol. Forecast. Soc. 
Chang."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"202","DOI":"10.1016\/j.techfore.2015.03.011","article-title":"Forecasting technology success based on patent data","volume":"96","author":"Altuntas","year":"2015","journal-title":"Technol. Forecast. Soc. Chang."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1007\/s11135-014-0145-1","article-title":"Constructing a weighted keyword-based patent network approach to identify technological trends and evolution in a field of green energy: A case of biofuels","volume":"50","author":"Wu","year":"2016","journal-title":"Qual. Quant."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1016\/j.asoc.2016.01.020","article-title":"A patent quality analysis and classification system using self-organizing maps with support vector machine","volume":"41","author":"Wu","year":"2016","journal-title":"Appl. Soft Comput."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.aei.2011.06.005","article-title":"A patent quality analysis for innovative technology and product development","volume":"26","author":"Trappey","year":"2012","journal-title":"Adv. Eng. Inform."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"883","DOI":"10.1007\/s11192-013-1010-z","article-title":"Identification and evaluation of corporations for merger and acquisition strategies using patent information and text mining","volume":"97","author":"Park","year":"2013","journal-title":"Scientometrics"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/j.wpi.2016.05.008","article-title":"The evolution of patent mining: Applying bibliometrics analysis and keyword network analysis","volume":"46","author":"Madani","year":"2016","journal-title":"World Pat. 
Inf."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"5200","DOI":"10.1016\/j.eswa.2008.06.131","article-title":"Extracting the significant-rare keywords for patent analysis","volume":"36","author":"Li","year":"2009","journal-title":"Expert Syst. Appl."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1804","DOI":"10.1016\/j.eswa.2007.01.033","article-title":"Visualization of patent analysis for emerging technology","volume":"34","author":"Kim","year":"2008","journal-title":"Expert Syst. Appl."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1007\/s11192-011-0543-2","article-title":"Detecting signals of new technological opportunities using semantic patent analysis and outlier detection","volume":"90","author":"Yoon","year":"2012","journal-title":"Scientometrics"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.wpi.2012.10.005","article-title":"Evaluating the effectiveness of keyword search strategy for patent identification","volume":"35","author":"Xie","year":"2013","journal-title":"World Pat. Inf."},{"key":"ref_14","first-page":"1169","article-title":"Automatic Keyword Extraction from Documents Using Conditional Random Fields","volume":"4","author":"Zhang","year":"2008","journal-title":"J. Comput. Inf. Syst."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Rose, S., Engel, D., Cramer, N., and Cowley, W. (2010). Automatic Keyword Extraction from Individual Documents, John Wiley & Sons, Ltd.","DOI":"10.1002\/9780470689646.ch1"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1016\/j.eswa.2016.03.045","article-title":"Ensemble of keyword extraction methods and classifiers in text classification","volume":"57","author":"Onan","year":"2016","journal-title":"Expert Syst. Appl."},{"key":"ref_17","unstructured":"Medelyan, O., Medelyan, O., Kan, M.Y., and Baldwin, T. (2010, January 15\u201316). 
SemEval-2010 task 5: Automatic keyphrase extraction from scientific articles. Proceedings of the International Workshop on Semantic Evaluation, Los Angeles, CA, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Wang, R., Liu, W., and Mcdonald, C. (2015, January 4\u20137). Using Word Embeddings to Enhance Keyword Identification for Scientific Publications. Proceedings of the Australasian Database Conference, Melbourne, VIC, Australia.","DOI":"10.1007\/978-3-319-19548-3_21"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Chen, Y., Yin, J., Zhu, W., and Qiu, S. (2015). Novel Word Features for Keyword Extraction, Springer International Publishing.","DOI":"10.1007\/978-3-319-21042-1_12"},{"key":"ref_20","unstructured":"Mikolov, T., Chen, K., Corrado, G., and Dean, J. (arXiv, 2013). Efficient Estimation of Word Representations in Vector Space, arXiv."},{"key":"ref_21","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., and Dean, J. (2013, January 5\u20138). Distributed Representations of Words and Phrases and their Compositionality. Proceedings of the 27th Annual Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"6007","DOI":"10.3390\/e17096007","article-title":"A Gloss Composition and Context Clustering Based Distributed Word Sense Representation Model","volume":"17","author":"Chen","year":"2015","journal-title":"Entropy"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Ardiansyah, S., Majid, M.A., and Zain, J.M. (2016, January 26\u201327). Knowledge of extraction from trained neural network by using decision tree. Proceedings of the International Conference on Science in Information Technology, Balikpapan, Indonesia.","DOI":"10.1109\/ICSITech.2016.7852637"},{"key":"ref_24","unstructured":"Witten, I.H., Paynter, G.W., Frank, E., Gutwin, C., and Nevill-Manning, C.G. (1999, January 11\u201314). 
KEA: Practical automatic keyphrase extraction. Proceedings of the ACM Conference on Digital Libraries, Berkeley, CA, USA."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Kanis, J. (2016, January 12\u201316). Digging Language Model\u2014Maximum Entropy Phrase Extraction. Proceedings of the International Conference on Text, Speech, and Dialogue, Brno, Czech Republic.","DOI":"10.1007\/978-3-319-45510-5_6"},{"key":"ref_26","unstructured":"Zhou, C., and Li, S. (2010, January 4\u20136). Research of Information Extraction Algorithm based on Hidden Markov Model. Proceedings of the International Conference on Information Science and Engineering, Hangzhou, China."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"917","DOI":"10.1007\/s11859-007-0038-4","article-title":"Keyword Extraction Based on tf\/idf for Chinese News Document","volume":"12","author":"Li","year":"2007","journal-title":"Wuhan Univ. J. Nat. Sci."},{"key":"ref_28","unstructured":"Mihalcea, R., and Tarau, P. (2004, January 25\u201326). TextRank: Bringing Order into Texts. Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain."},{"key":"ref_29","first-page":"38","article-title":"Identifying predators of Halyomorpha halys using molecular gut content analysis","volume":"40","author":"Nielsen","year":"2015","journal-title":"J. Inf."},{"key":"ref_30","unstructured":"Rose, S.J., Cowley, W.E., Crow, V.L., and Cramer, N.O. (2012). Rapid Automatic Keyword Extraction for Information Retrieval and Analysis. (8131735 B2), U.S. Patent."},{"key":"ref_31","unstructured":"Wartena, C., Brussee, R., and Slakhorst, W. (September, January 30). Keyword Extraction Using Word Co-occurrence. Proceedings of the Workshops on Database and Expert Systems Applications, Bilbao, Spain."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wartena, C., and Brussee, R. (2008, January 1\u20135). Topic Detection by Clustering Keywords. 
Proceedings of the International Workshop on Database and Expert Systems Application, Turin, Italy.","DOI":"10.1109\/DEXA.2008.120"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1111\/j.1467-9310.2007.00493.x","article-title":"Morphology analysis for technology roadmapping: Application of text mining","volume":"38","author":"Yoon","year":"2008","journal-title":"R&D Manag."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"564","DOI":"10.1016\/j.cie.2011.12.002","article-title":"Modeling and analyzing technology innovation in the energy sector: Patent-based HMM approach","volume":"63","author":"Lee","year":"2012","journal-title":"Comput. Ind. Eng."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1216","DOI":"10.1016\/j.ipm.2006.11.011","article-title":"Text mining techniques for patent analysis","volume":"43","author":"Tseng","year":"2007","journal-title":"Inf. Process. Manag."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1111\/j.1467-9310.2010.00612.x","article-title":"Identifying technology trends for R&D planning using TRIZ and text mining","volume":"40","author":"Wang","year":"2010","journal-title":"R&D Manag."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"4348","DOI":"10.1016\/j.eswa.2015.01.050","article-title":"Keyword selection and processing strategy for applying text mining to patent analysis","volume":"42","author":"Noh","year":"2015","journal-title":"Expert Syst. Appl."},{"key":"ref_38","first-page":"1137","article-title":"A neural probabilistic language model","volume":"3","author":"Bengio","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., and Manning, C.D. (2014, January 25\u201329). Glove: Global Vectors for Word Representation. 
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1162"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/20\/2\/104\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T14:53:34Z","timestamp":1760194414000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/20\/2\/104"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,2,2]]},"references-count":39,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2018,2]]}},"alternative-id":["e20020104"],"URL":"https:\/\/doi.org\/10.3390\/e20020104","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,2,2]]}}}