{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,6,11]],"date-time":"2024-06-11T05:17:11Z","timestamp":1718083031081},"reference-count":47,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2021,11,24]],"date-time":"2021-11-24T00:00:00Z","timestamp":1637712000000},"content-version":"vor","delay-in-days":327,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,11,22]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>To develop commonsense-grounded NLP applications, a comprehensive and accurate commonsense knowledge graph (CKG) is needed. It is time-consuming to manually construct CKGs, and many research efforts have been devoted to the automatic construction of CKGs. Previous approaches focus on generating concepts that have direct and obvious relationships with existing concepts and lack the capability to generate unobvious concepts. In this work, we aim to bridge this gap. We propose a general graph-to-paths pretraining framework that leverages high-order structures in CKGs to capture high-order relationships between concepts. We instantiate this general framework to four special cases: long path, path-to-path, router, and graph-node-path. Experiments on two datasets demonstrate the effectiveness of our methods. 
The code will be released via the public GitHub repository.<\/jats:p>","DOI":"10.1162\/tacl_a_00426","type":"journal-article","created":{"date-parts":[[2021,11,24]],"date-time":"2021-11-24T18:52:53Z","timestamp":1637779973000},"page":"1268-1284","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":1,"title":["Structured Self-Supervised Pretraining for Commonsense Knowledge Graph Completion"],"prefix":"10.1162","volume":"9","author":[{"given":"Jiayuan","family":"Huang","sequence":"first","affiliation":[{"name":"Zhejiang University, China"}]},{"given":"Yangkai","family":"Du","sequence":"additional","affiliation":[{"name":"Zhejiang University, China"}]},{"given":"Shuting","family":"Tao","sequence":"additional","affiliation":[{"name":"Zhejiang University, China"}]},{"given":"Kun","family":"Xu","sequence":"additional","affiliation":[{"name":"Tencent AI Lab, USA"}]},{"given":"Pengtao","family":"Xie","sequence":"additional","affiliation":[{"name":"UC San Diego, USA. 
p1xie@eng.ucsd.edu"}]}],"member":"281","published-online":{"date-parts":[[2021,11,22]]},"reference":[{"key":"2021112418524472400_bib1","first-page":"2787","article-title":"Translating embeddings for modeling multi-relational data","volume-title":"Advances in Neural Information Processing Systems","author":"Bordes","year":"2013"},{"key":"2021112418524472400_bib2","doi-asserted-by":"crossref","first-page":"4762","DOI":"10.18653\/v1\/P19-1470","article-title":"Comet: Commonsense transformers for automatic knowledge graph construction","author":"Bosselut","year":"2019","journal-title":"In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics"},{"key":"2021112418524472400_bib3","article-title":"Spectral networks and locally connected networks on graphs","author":"Bruna","year":"2013","journal-title":"arXiv preprint arXiv:1312.6203"},{"key":"2021112418524472400_bib4","article-title":"Harp: Hierarchical representation learning for networks","author":"Chen","year":"2017","journal-title":"arXiv preprint arXiv:1706.07845"},{"key":"2021112418524472400_bib5","first-page":"3844","article-title":"Convolutional neural networks on graphs with fast localized spectral filtering","volume-title":"Advances in Neural Information Processing Systems","author":"Defferrard","year":"2016"},{"key":"2021112418524472400_bib6","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":"2018","journal-title":"arXiv preprint arXiv: 1810.04805"},{"key":"2021112418524472400_bib7","article-title":"Commonsense knowledge mining from pretrained models","author":"Feldman","year":"2019","journal-title":"In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)"},{"key":"2021112418524472400_bib8","doi-asserted-by":"crossref","DOI":"10.1609\/aaai.v32i1.11977","article-title":"A knowledge- 
grounded neural conversation model","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Ghazvininejad","year":"2018"},{"key":"2021112418524472400_bib9","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1162\/tacl_a_00302","article-title":"A knowledge-enhanced pretraining model for commonsense story generation","volume":"8","author":"Guan","year":"2020","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2021112418524472400_bib10","first-page":"1024","article-title":"Inductive representation learning on large graphs","volume-title":"Advances in Neural Information Processing Systems","author":"Hamilton","year":"2017"},{"issue":"8","key":"2021112418524472400_bib11","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Computation"},{"key":"2021112418524472400_bib12","article-title":"Strategies for pre-training graph neural networks","author":"Hu","year":"2019","journal-title":"arXiv preprint arXiv:1905.12265"},{"key":"2021112418524472400_bib13","first-page":"985","article-title":"Knowledge graph completion with adaptive sparse transfer matrix.","volume-title":"AAAI","author":"Ji","year":"2016"},{"key":"2021112418524472400_bib14","article-title":"Adam: A method for stochastic optimization","author":"Kingma","year":"2014","journal-title":"arXiv preprint arXiv:1412.6980"},{"key":"2021112418524472400_bib15","article-title":"Semi-supervised classification with graph convolutional networks","author":"Kipf","year":"2016","journal-title":"arXiv preprint arXiv:1609. 
02907"},{"key":"2021112418524472400_bib16","article-title":"Text generation from knowledge graphs with graph transformers","author":"Koncel-Kedziorski","year":"2019","journal-title":"arXiv preprint arXiv:1904.02342"},{"key":"2021112418524472400_bib17","article-title":"ALBERT: A lite BERT for self-supervised learning of language representations","author":"Lan","year":"2019","journal-title":"arXiv preprint arXiv:1909.11942"},{"key":"2021112418524472400_bib18","article-title":"BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension","author":"Lewis","year":"2019","journal-title":"arXiv preprint arXiv:1910.13461"},{"key":"2021112418524472400_bib19","doi-asserted-by":"crossref","first-page":"1445","DOI":"10.18653\/v1\/P16-1137","article-title":"Commonsense knowledge base completion","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Li","year":"2016"},{"key":"2021112418524472400_bib20","article-title":"RoBERTa: A robustly optimized bert pretraining approach","author":"Liu","year":"2019","journal-title":"arXiv preprint arXiv:1907.11692"},{"key":"2021112418524472400_bib21","doi-asserted-by":"crossref","first-page":"1211","DOI":"10.1145\/3038912.3052675","article-title":"Neural network-based question answering over knowledge graphs on word and character level","volume-title":"Proceedings of the 26th international conference on World Wide Web","author":"Lukovnikov","year":"2017"},{"key":"2021112418524472400_bib22","article-title":"Exploiting structural and semantic context for commonsense knowledge base completion","author":"Malaviya","year":"2019","journal-title":"Computing Research Repository, arXiv:1910. 
02915"},{"key":"2021112418524472400_bib23","doi-asserted-by":"crossref","DOI":"10.1609\/aaai.v34i03.5684","article-title":"Commonsense knowledge base completion with structural and semantic context","author":"Malaviya","year":"2020","journal-title":"Proceedings of the 34th AAAI Conference on Artificial Intelligence"},{"key":"2021112418524472400_bib24","first-page":"1105","article-title":"Asymmetric transitivity preserving graph embedding","volume-title":"Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining","author":"Mingdong","year":"2016"},{"key":"2021112418524472400_bib25","first-page":"311","article-title":"BLEU: a method for automatic evaluation of machine translation","volume-title":"Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics","author":"Papineni","year":"2002"},{"key":"2021112418524472400_bib26","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1145\/2623330.2623732","article-title":"Deepwalk: Online learning of social representations","volume-title":"Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Perozzi","year":"2014"},{"key":"2021112418524472400_bib27","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1145\/3159652.3159706","article-title":"Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec","volume-title":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","author":"Qiu","year":"2018"},{"key":"2021112418524472400_bib28","article-title":"Improving language understanding by generative pre-training","author":"Radford","year":"2018"},{"key":"2021112418524472400_bib29","article-title":"Language models are unsupervised multitask learners","author":"Radford","year":"2019"},{"key":"2021112418524472400_bib30","doi-asserted-by":"crossref","first-page":"141","DOI":"10.18653\/v1\/K18-1014","article-title":"Commonsense knowledge 
base completion and generation","volume-title":"Proceedings of the 22nd Conference on Computational Natural Language Learning","author":"Saito","year":"2018"},{"key":"2021112418524472400_bib31","first-page":"3027","article-title":"Atomic: An atlas of machine commonsense for if-then reasoning","volume":"33","author":"Sap","year":"2019"},{"key":"2021112418524472400_bib32","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1007\/978-3-319-93417-4_38","article-title":"Modeling relational data with graph convolutional networks","volume-title":"European Semantic Web Conference","author":"Schlichtkrull","year":"2018"},{"key":"2021112418524472400_bib33","article-title":"Neural machine translation of rare words with subword units","author":"Sennrich","year":"2015","journal-title":"arXiv preprint arXiv:1508.07909"},{"key":"2021112418524472400_bib34","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1007\/978-3-642-35085-6_6","article-title":"Conceptnet 5: A large semantic network for relational knowledge","author":"Speer","year":"2013","journal-title":"The People\u2019s Web Meets NLP"},{"key":"2021112418524472400_bib35","article-title":"Conceptnet 5.5: An open multilingual graph of general knowledge","author":"Speer","year":"2016","journal-title":"arXiv preprint arXiv:1612.03975"},{"key":"2021112418524472400_bib36","first-page":"3104","article-title":"Sequence to sequence learning with neural networks","volume-title":"Advances in Neural Information Processing Systems","author":"Sutskever","year":"2014"},{"key":"2021112418524472400_bib37","article-title":"CommonsenseQA: A question answering challenge targeting commonsense knowledge","author":"Talmor","year":"2018","journal-title":"arXiv preprint arXiv: 1811.00937"},{"key":"2021112418524472400_bib38","doi-asserted-by":"crossref","first-page":"1067","DOI":"10.1145\/2736277.2741093","article-title":"Line: Large-scale information network embedding","volume-title":"Proceedings of the 24th International Conference on 
World Wide Web","author":"Tang","year":"2015"},{"key":"2021112418524472400_bib39","first-page":"5998","article-title":"Attention is all you need","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani","year":"2017"},{"key":"2021112418524472400_bib40","article-title":"Graph attention networks","author":"Veli\u010dkovi\u0107","year":"2017","journal-title":"arXiv preprint arXiv:1710.10903"},{"key":"2021112418524472400_bib41","first-page":"1112","article-title":"Knowledge graph embedding by translating on hyperplanes.","volume-title":"AAAI","author":"Wang","year":"2014"},{"key":"2021112418524472400_bib42","article-title":"Transg: A generative mixture model for knowledge graph embedding","author":"Xiao","year":"2015","journal-title":"arXiv preprint arXiv:1509.05488"},{"key":"2021112418524472400_bib43","first-page":"4800","article-title":"Hierarchical graph representation learning with differentiable pooling","volume-title":"Advances in Neural Information Processing Systems","author":"Ying","year":"2018"},{"key":"2021112418524472400_bib44","doi-asserted-by":"crossref","DOI":"10.1609\/aaai.v32i1.11923","article-title":"Augmenting end-to-end dialogue systems with commonsense knowledge","volume-title":"Thirty-Second AAAI Conference on Artificial Intelligence","author":"Young","year":"2018"},{"key":"2021112418524472400_bib45","article-title":"Hierarchical graph pooling with structure learning","author":"Zhang","year":"2019","journal-title":"arXiv preprint arXiv:1911.05954"},{"key":"2021112418524472400_bib46","first-page":"4623","article-title":"Commonsense knowledge aware conversation generation with graph attention.","volume-title":"International Joint Conferences on Artificial Intelligence Organization (IJCAI)","author":"Zhou","year":"2018"},{"key":"2021112418524472400_bib47","article-title":"Flexible end-to-end dialogue system for knowledge grounded conversation","author":"Zhu","year":"2017","journal-title":"arXiv preprint 
arXiv:1709.04264"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00426\/1974759\/tacl_a_00426.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00426\/1974759\/tacl_a_00426.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,16]],"date-time":"2023-01-16T05:26:44Z","timestamp":1673846804000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00426\/108474\/Structured-Self-Supervised-Pretraining-for"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021]]},"references-count":47,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00426","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021]]},"published":{"date-parts":[[2021]]}}}