{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T03:01:24Z","timestamp":1760151684732,"version":"build-2065373602"},"reference-count":22,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2022,3,30]],"date-time":"2022-03-30T00:00:00Z","timestamp":1648598400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Combining topic discovery with topic-specific word embeddings is a popular, powerful method for text mining in a small collection of documents. However, the existing researches purely modeled on the contents of documents and led to discovering noisy topics. This paper proposes a generative model, the skip-gram topical word-embedding model (simplified as steoLC) on asymmetric document link networks, where nodes correspond to documents and links refer to directed references between documents. It simultaneously improves the performance of topic discovery and polysemous word embeddings. Each skip-gram in a document is generated based on the topic distribution of the document and the two word embeddings in the skip-gram. Each directed link is generated based on the hidden topic distribution of the beginning document node. For a document, the skip-grams and links share a common topic distribution. Parameter estimation is inferred and an algorithm is designed to learn the model parameters by combining the expectation-maximization (EM) algorithm with the negative sampling method. Experimental results show that our method generates more useful topic-specific word embeddings and coherent latent topics than the state-of-the-art models.<\/jats:p>","DOI":"10.3390\/sym14040703","type":"journal-article","created":{"date-parts":[[2022,3,30]],"date-time":"2022-03-30T21:24:42Z","timestamp":1648675482000},"page":"703","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A Generative Model for Topic Discovery and Polysemy Embeddings on Directed Attributed Networks"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5865-032X","authenticated-orcid":false,"given":"Bianfang","family":"Chai","sequence":"first","affiliation":[{"name":"Hebei Key Laboratory of Optoelectronic Information and Geo-Detection Technology, Hebei GEO University, Shijiazhuang 050031, China"},{"name":"Information Engineering College, Hebei GEO University, Shijiazhuang 050031, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xinyu","family":"Ji","sequence":"additional","affiliation":[{"name":"Information Engineering College, Hebei GEO University, Shijiazhuang 050031, China"},{"name":"Intelligent Sensor Network Engineering Research Center of Hebei Province, Hebei GEO University, Shijiazhuang 050031, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianglin","family":"Guo","sequence":"additional","affiliation":[{"name":"Information Engineering College, Hebei GEO University, Shijiazhuang 050031, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lixiao","family":"Ma","sequence":"additional","affiliation":[{"name":"Information Engineering College, Hebei GEO University, Shijiazhuang 050031, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yibo","family":"Zheng","sequence":"additional","affiliation":[{"name":"Hebei Key Laboratory of Optoelectronic Information and Geo-Detection Technology, Hebei GEO University, Shijiazhuang 050031, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,3,30]]},"reference":[{"key":"ref_1","unstructured":"Laskey, K.B., and Prade, H. (August, January 30). Probabilistic Latent Semantic Analysis. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI\u201999), Stockholm, Sweden."},{"key":"ref_2","first-page":"993","article-title":"Latent Dirichlet Allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_3","first-page":"1137","article-title":"A Neural Probabilistic Language Model","volume":"3","author":"Bengio","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Li, S., Chua, T.S., Zhu, J., and Miao, C. (2016, January 7\u201313). Generative Topic Embedding: A Continuous Representation of Documents. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany.","DOI":"10.18653\/v1\/P16-1063"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Das, R., Zaheer, M., and Dyer, C. (2015, January 26\u201331). Gaussian LDA for Topic Models with Word Embeddings. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China.","DOI":"10.3115\/v1\/P15-1077"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1162\/tacl_a_00140","article-title":"Improving Topic Models with Latent Feature Word Representations","volume":"3","author":"Nguyen","year":"2015","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_7","unstructured":"Bonet, B., and Koenig, S. (2015, January 25\u201330). Topical Word Embeddings. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA."},{"key":"ref_8","first-page":"1052","article-title":"Cross-Topic Distributional Semantic Representations Via Unsupervised Mappings","volume":"Volume 1","author":"Burstein","year":"2019","journal-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Xun, G., Li, Y., Gao, J., and Zhang, A. (2017, January 13\u201317). Collaboratively Improving Topic Discovery and Word Embeddings by Coordinating Global and Local Contexts. Proceedings of the 23rd ACM SIGKDD International Conference, Halifax, NS, Canada.","DOI":"10.1145\/3097983.3098009"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1162\/tacl_a_00326","article-title":"A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings","volume":"8","author":"Zhu","year":"2020","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_11","unstructured":"Berger-Wolf, T.Y., and Chawla, N.V. (2019, January 2\u20134). TMSA: A Mutual Learning Model for Topic Discovery and Word Embedding. Proceedings of the 2019 SIAM International Conference on Data Mining, SDM 2019, Calgary, AB, Canada."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Shi, B., Lam, W., Jameel, S., Schockaert, S., and Lai, K.P. (2017, January 7\u201311). Jointly learning word embeddings and latent topics. Proceedings of the SIGIR 2017 International Conference, Tokyo, Japan.","DOI":"10.1145\/3077136.3080806"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"36:1","DOI":"10.1145\/3385415","article-title":"Network Embedding for Community Detection in Attributed Networks","volume":"14","author":"Sun","year":"2020","journal-title":"ACM Trans. Knowl. Discov. Data"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"100286","DOI":"10.1016\/j.cosrev.2020.100286","article-title":"Community detection in node-attributed social networks: A survey","volume":"37","author":"Chunaev","year":"2020","journal-title":"Comput. Sci. Rev."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Bothorel, C., Cruz, J.D., Magnani, M., and Micenkov\u00e1, B. (2015). Clustering Attributed Graphs: Models, Measures and Methods. arXiv.","DOI":"10.1017\/nws.2015.9"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1016\/j.ins.2016.11.028","article-title":"Semi-supervised community detection based on non-negative matrix factorization with node popularity","volume":"381","author":"Liu","year":"2017","journal-title":"Inf. Sci."},{"key":"ref_17","unstructured":"Champin, P., Gandon, F.L., Lalmas, M., and Ipeirotis, P.G. (2018, January 23\u201327). Community detection in Attributed Network. Proceedings of the Companion of the The Web Conference 2018 on The Web Conference 2018, Lyon, France."},{"key":"ref_18","unstructured":"Yang, T., Jin, R., Chi, Y., and Zhu, S. (July, January 28). Combining Link and Content for Community Detection. Proceedings of the Acm Sigkdd International Conference on Knowledge Discovery Data Mining, Paris, France."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Jin, D., Huang, J., Jiao, P., Yang, L., He, D., Fogelman-Souli\u00e9, F., and Huang, Y. (2019, January 13\u201317). A Novel Generative Topic Embedding Model by Introducing Network Communities. Proceedings of the The World Wide Web Conference, San Francisco, CA, USA.","DOI":"10.1145\/3308558.3313623"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Tang, J., Zhang, J., and Yao, L. (2008, January 24\u201327). ArnetMiner: Extraction and Mining of Academic Social Networks. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data, Las Vegas, NV, USA.","DOI":"10.1145\/1401890.1402008"},{"key":"ref_21","first-page":"2579","article-title":"Visualizing Data using t-SNE","volume":"9","author":"Laurens","year":"2008","journal-title":"J. Mach. Learn. Res."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"RoDer, M., Both, A., and Hinneburg, A. (2015). Exploring the Space of Topic Coherence Measures, ACM.","DOI":"10.1145\/2684822.2685324"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/4\/703\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:46:31Z","timestamp":1760136391000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/14\/4\/703"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,30]]},"references-count":22,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2022,4]]}},"alternative-id":["sym14040703"],"URL":"https:\/\/doi.org\/10.3390\/sym14040703","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2022,3,30]]}}}