{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T12:23:58Z","timestamp":1773923038250,"version":"3.50.1"},"reference-count":64,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2016,8,29]],"date-time":"2016-08-29T00:00:00Z","timestamp":1472428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>We apply the integrated syntactic graph feature extraction methodology to the task of automatic authorship detection. This graph-based representation allows integrating different levels of language description into a single structure. We extract textual patterns based on features obtained from shortest path walks over integrated syntactic graphs and apply them to determine the authors of documents. On average, our method outperforms the state of the art approaches and gives consistently high results across different corpora, unlike existing methods. Our results show that our textual patterns are useful for the task of authorship attribution.<\/jats:p>","DOI":"10.3390\/s16091374","type":"journal-article","created":{"date-parts":[[2016,8,29]],"date-time":"2016-08-29T10:18:38Z","timestamp":1472465918000},"page":"1374","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["Automatic Authorship Detection Using Textual Patterns Extracted from Integrated Syntactic Graphs"],"prefix":"10.3390","volume":"16","author":[{"given":"Helena","family":"G\u00f3mez-Adorno","sequence":"first","affiliation":[{"name":"Instituto Polit\u00e9cnico Nacional, Centro de Investigaci\u00f3n en Computaci\u00f3n, Av. 
Juan de Dios B\u00e1tiz S\/N, Mexico City 07738, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Grigori","family":"Sidorov","sequence":"additional","affiliation":[{"name":"Instituto Polit\u00e9cnico Nacional, Centro de Investigaci\u00f3n en Computaci\u00f3n, Av. Juan de Dios B\u00e1tiz S\/N, Mexico City 07738, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Pinto","sequence":"additional","affiliation":[{"name":"Benem\u00e9rita Universidad Aut\u00f3noma de Puebla, Facultad de Ciencias de la Computaci\u00f3n, Av. San Claudio y 14 Sur, Puebla 72570, Mexico,"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Darnes","family":"Vilari\u00f1o","sequence":"additional","affiliation":[{"name":"Benem\u00e9rita Universidad Aut\u00f3noma de Puebla, Facultad de Ciencias de la Computaci\u00f3n, Av. San Claudio y 14 Sur, Puebla 72570, Mexico,"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander","family":"Gelbukh","sequence":"additional","affiliation":[{"name":"Instituto Polit\u00e9cnico Nacional, Centro de Investigaci\u00f3n en Computaci\u00f3n, Av. Juan de Dios B\u00e1tiz S\/N, Mexico City 07738, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2016,8,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Mihalcea, R., and Radev, D. (2011). Graph-Based Natural Language Processing and Information Retrieval, MIT Press.","DOI":"10.1017\/CBO9780511976247"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1016\/j.patrec.2013.12.004","article-title":"A graph-based multi-level linguistic representation for document understanding","volume":"41","author":"Pinto","year":"2014","journal-title":"Pattern Recognit. 
Lett."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1016\/j.eswa.2013.08.015","article-title":"Syntactic N-grams as machine learning features for natural language processing","volume":"41","author":"Sidorov","year":"2013","journal-title":"Expert Syst. Appl."},{"key":"ref_4","unstructured":"Salton, G. (1988). Automatic Text Processing, Addison-Wesley Longman Publishing Co., Inc."},{"key":"ref_5","first-page":"357","article-title":"Segmentation strategies to face morphology challenges in Brazilian-Portuguese\/English statistical machine translation and its integration in cross-language information retrieval","volume":"19","year":"2015","journal-title":"Comput. Sist."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"23","DOI":"10.17562\/PB-43-3","article-title":"Semantic textual entailment recognition using UNL","volume":"43","author":"Pakray","year":"2011","journal-title":"Polibits"},{"key":"ref_7","unstructured":"Vilarino, D., Pinto, D., Le\u00f3n, S., Alem\u00e1n, Y., and G\u00f3mez-Adorno, H. (2013, January 14\u201315). BUAP: N-gram based Feature Evaluation for the Cross-Lingual Textual Entailment Task. Proceedings of the 7th International Workshop on Semantic Evaluation, Atlanta, GA, USA."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Pakray, P., Pal, S., Poria, S., Bandyopadhyay, S., and Gelbukh, A. (2011, January 14\u201315). A textual entailment system using anaphora resolution. Proceedings of the Text Analysis Conference on Recognizing Textual Entailment Track, Gaithersburg, MD, USA.","DOI":"10.1109\/ICACTE.2010.5579163"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1109\/MCI.2015.2471215","article-title":"Sentiment data flow analysis by means of dynamic linguistic patterns","volume":"10","author":"Poria","year":"2015","journal-title":"IEEE Comput. Intell. Mag."},{"key":"ref_10","unstructured":"Mladenic, D., and Grobelnik, M. (1998, January 24\u201326). 
Word sequences as features in text-learning. Proceedings of the 17th Electrotechnical and Computer Science Conference, Ljubljana, Slovenia."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1023\/A:1002681919510","article-title":"Computer-based authorship attribution without lexical measures","volume":"35","author":"Stamatatos","year":"2001","journal-title":"Comput. Hum."},{"key":"ref_12","unstructured":"Keselj, V., Peng, F., Cercone, N., and Thomas, C. (2003, January 22\u201325). N-gram-based author profiles for authorship attribution. Proceedings of the Conference Pacific Association for Computational Linguistics, Halifax, NS, Canada."},{"key":"ref_13","unstructured":"Markov, I., G\u00f3mez-Adorno, H., Sidorov, G., and Gelbukh, A. (2016, January 5\u20138). Adapting cross-genre author profiling to language and corpus. Proceedings of the CLEF 2016 Evaluation Labs, \u00c9vora, Portugal."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Mauldin, M.L. (1991, January 13\u201316). Retrieval performance in ferret a conceptual information retrieval system. Proceedings of the 14th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Chicago, IL, USA.","DOI":"10.1145\/122860.122896"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Croft, W.B., Turtle, H.R., and Lewis, D.D. (1991, January 13\u201316). The use of phrases and structured queries in information retrieval. Proceedings of the 14th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Chicago, IL, USA.","DOI":"10.1145\/122860.122864"},{"key":"ref_16","first-page":"169","article-title":"Syntactic dependency based n-grams in rule based automatic English as second language grammar correction","volume":"4","author":"Sidorov","year":"2013","journal-title":"Int. J. 
Computer Appl."},{"key":"ref_17","unstructured":"Posadas-Dur\u00e1n, J.P., Markov, I., G\u00f3mez-Adorno, H., Sidorov, G., Batyrshin, I., Gelbukh, A., and Pichardo-Lagunas, O. (2015, January 8\u201311). Syntactic n-grams as features for the author profiling task. Proceedings of the CLEF 2015 Evaluation Labs, Toulouse, France."},{"key":"ref_18","unstructured":"Sidorov, G., G\u00f3mez-Adorno, H., Markov, I., Pinto, D., and Loya, N. (2015, January 17\u201319). Syntactic n-grams as features for the author profiling task. Proceedings of the Annual Conference of the North American Fuzzy Information processing Society and 5th World Conference on Soft Computing, Redmond, WA, USA."},{"key":"ref_19","unstructured":"Cambria, E., Poria, S., Gelbukh, A., and Kwok, K. (2014, January 7\u201311). A common-sense based api for concept-level sentiment analysis. Proceedings of the 4th Workshop on Making Sense of Microposts, Seoul, Korea."},{"key":"ref_20","unstructured":"Poria, S., Gelbukh, A., Das, D., and Bandyopadhyay, S. (November, January 27). Fuzzy clustering for semi-supervised learning\u2014Case study: Construction of an emotion lexicon. Proceedings of the Mexican International Conference on Artificial Intelligence, San Luis Potosi, Mexico."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zha, H. (2002, January 11\u201315). Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering. Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Tampere, Finland.","DOI":"10.1145\/564376.564398"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Nicolae, C., and Nicolae, G. (2006, January 22\u201323). BestCut: A graph algorithm for coreference resolution. 
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Sydney, Australia.","DOI":"10.3115\/1610075.1610115"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Dorow, B., and Widdows, D. (2003, January 12\u201317). Discovering corpus-specific word senses. Proceedings of the 10th Conference on European Chapter of the Association for Computational Linguistics, Budapest, Hungary.","DOI":"10.3115\/1067737.1067753"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1016\/j.csl.2004.05.002","article-title":"HyperLex: Lexical cartography for information retrieval","volume":"18","year":"2004","journal-title":"Comput. Speech Lang."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Agirre, E., Mart\u00ednez, D., de Lacalle, O.L., and Soroa, A. (2006, January 22\u201323). Two graph-based algorithms for state-of-the-art WSD. Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia.","DOI":"10.3115\/1610075.1610157"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Matsuo, Y., Sakaki, T., Uchiyama, K., and Ishizuka, M. (2006, January 22\u201323). Graph-based word clustering using a web search engine. Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia.","DOI":"10.3115\/1610075.1610150"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Biemann, C. (2006, January 9). Chinese Whispers: An efficient graph clustering algorithm and its application to natural language processing problems. 
Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing, New York, NY, USA.","DOI":"10.3115\/1654758.1654774"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1007\/s10115-004-0194-1","article-title":"Generative model-based document clustering: A comparative study","volume":"8","author":"Zhong","year":"2005","journal-title":"Knowl. Inf. Syst."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Jin, W., and Srihari, R.K. (2007, January 11\u201315). Graph-based text representation and knowledge discovery. Proceedings of the 2007 ACM Symposium on Applied Computing, Seoul, Korea.","DOI":"10.1145\/1244002.1244182"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zhou, F., Zhang, F., and Yang, B. (2010, January 21\u201323). Graph-based text representation model and its realization. Proceedings of the 2010 International Conference on Natural Language Processing and Knowledge Engineering, Beijing, China.","DOI":"10.1109\/NLPKE.2010.5587861"},{"key":"ref_31","unstructured":"Kiros, R., Zemel, R.S., and Salakhutdinov, R.R. (2014, January 8\u201313). A multiplicative model for learning distributed text-based attribute representations. Proceedings of the Advances in Neural Information Processing Systems: Annual Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_32","unstructured":"Rhodes, D. Author Attribution with CNNs. Available online: https:\/\/www.semanticscholar.org\/paper\/Author-Attribution-with-Cnn-s-Rhodes\/0a904f9d6b47dfc574f681f4d3b41bd840871b6f\/pdf."},{"key":"ref_33","unstructured":"Juola, P. (2012, January 17\u201320). An overview of the traditional authorship attribution subtask. Proceedings of the CLEF 2012 Evaluation Labs and Workshop\u2014Working Notes Papers, Rome, Italy."},{"key":"ref_34","unstructured":"Mikolov, T., Chen, K., Corrado, G., and Dean, J. Efficient Estimation of Word Representations in Vector Space. 
Available online: http:\/\/arxiv.org\/abs\/1301.3781."},{"key":"ref_35","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., and Dean, J. (2013, January 5\u20138). Distributed representations of words and phrases and their compositionality. Proceedings of the Advances in Neural Information Processing Systems: 27th Annual Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liu, H., and Xu, C. (2011). Can syntactic networks indicate morphological complexity of a language?. Europhys. Lett., 93.","DOI":"10.1209\/0295-5075\/93\/28005"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1139","DOI":"10.1007\/s11434-013-5711-8","article-title":"Language clustering with word co-occurrence networks based on parallel texts","volume":"58","author":"Liu","year":"2013","journal-title":"Chin. Sci. Bull."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"598","DOI":"10.1016\/j.plrev.2014.04.004","article-title":"Approaching human language with complex networks","volume":"11","author":"Cong","year":"2014","journal-title":"Phys. Life Rev."},{"key":"ref_39","first-page":"1","article-title":"Empirical characterization of modern Chinese as a multi-level system from the complex network approach","volume":"1","author":"Liu","year":"2014","journal-title":"J. Chin. Linguist."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1007\/BF01263048","article-title":"Understanding semantic relationships","volume":"2","author":"Storey","year":"1993","journal-title":"VLDB J."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Bejar, I., Chaffin, R., and Embretson, S. (1991). Cognitive and Psychometric Analysis of Analogical Problem Solving, Springer Science and Business Media.","DOI":"10.1007\/978-1-4613-9690-1"},{"key":"ref_42","unstructured":"Nugues, P.M. (2006). 
An Introduction to Language Processing with Perl and Prolog: An Outline of Theories, Implementation, and Application with Special Consideration of English, French, and German, Springer-Verlag Berlin Heidelberg."},{"key":"ref_43","unstructured":"Socher, R., Bauer, J., Manning, C.D., and Ng, A.Y. (2013, January 4\u20139). Parsing with compositional vector grammars. Proceedings of ACL Conference 2013, Sofia, Bulgaria."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"538","DOI":"10.1002\/asi.21001","article-title":"A survey of modern authorship attribution methods","volume":"60","author":"Stamatatos","year":"2009","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1145\/219717.219748","article-title":"WordNet: A lexical database for English","volume":"38","author":"Miller","year":"1995","journal-title":"Commun. ACM"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1007\/BF01386390","article-title":"A note on two problems in connexion with graphs","volume":"1","author":"Dijkstra","year":"1959","journal-title":"Numerische Math."},{"key":"ref_47","unstructured":"G\u00f3mez-Adorno, H.M. (2013). Una Metodolog\u00eda Para el Desarrollo de Sistemas de b\u00fasqueda de Respuestas Orientados a Pruebas de Lectura Comprensiva. [Master\u2019s Thesis, Benem\u00e9rita Universidad Aut\u00f3noma de Puebla]."},{"key":"ref_48","first-page":"491","article-title":"Soft similarity and soft cosine measure: Similarity of features in vector space model","volume":"18","author":"Sidorov","year":"2014","journal-title":"Comput. Sist."},{"key":"ref_49","first-page":"1261","article-title":"Measuring differentiability: Unmasking pseudonymous authors","volume":"8","author":"Koppel","year":"2007","journal-title":"J. Mach. Learn. Res."},{"key":"ref_50","unstructured":"Juola, P., and Stamatatos, E. Overview of the Author Identification Task at PAN 2013. 
Available online: http:\/\/ceur-ws.org\/Vol-1179\/CLEF2013wn-PAN-JuolaEt2013.pdf."},{"key":"ref_51","unstructured":"Stamatatos, E., Daelemans, W., Verhoeven, B., Potthast, M., Stein, B., Juola, P., Sanchez-Perez, M.A., and Barr\u00f3n-Cede\u00f1o, A. (2014, January 15\u201318). Overview of the author identification task at PAN 2014. Proceedings of the Working Notes for CLEF 2014 Conference, Sheffield, UK."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Potthast, M., Braun, S., Buz, T., Duffhauss, F., Friedrich, F., G\u00fclzow, J.M., K\u00f6hler, J., L\u00f6tzsch, W., M\u00fcller, F., and M\u00fcller, M.E. (2016, January 21\u201323). Who wrote the web? Revisiting influential author identification research applicable to information retrieval. Proceedings of the Advances in Information Retrieval: 38th European Conference on IR Research, Padua, Italy.","DOI":"10.1007\/978-3-319-30671-1_29"},{"key":"ref_53","unstructured":"Plakias, S., and Stamatatos, E. (2008, January 2\u20134). Tensor space models for authorship identification. Proceedings of the Hellenic Conference on Artificial Intelligence, Syros, Greece."},{"key":"ref_54","unstructured":"Plakias, S., and Stamatatos, E. (2008, January 21\u201325). Author identification using a tensor space representation. Proceedings of the 18th European Conference on Artificial Intelligence, Patras, Greece."},{"key":"ref_55","first-page":"361","article-title":"RCV1: A new benchmark collection for text categorization research","volume":"5","author":"Lewis","year":"2004","journal-title":"J. Mach. Learn. Res."},{"key":"ref_56","unstructured":"Vilari\u00f1o, D., Pinto, D., G\u00f3mez, H., Le\u00f3n, S., and Castillo, E. Lexical-Syntactic and Graph-Based Features for Authorship Verification. 
Available online: https:\/\/www.semanticscholar.org\/paper\/Lexical-Syntactic-and-Graph-Based-Features-for-Ayala-Pinto\/cd49fb977509fc256bc03d2e1d2443d64d41cb7c\/pdf."},{"key":"ref_57","unstructured":"Escalante, H.J., Solorio, T., and Montes-y G\u00f3mez, M. (2011, January 19\u201324). Local histograms of character n-grams for authorship attribution. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, OR, USA."},{"key":"ref_58","unstructured":"Seidman, S. Authorship Verification Using the Impostors Method. Available online: http:\/\/citeseerx.ist.psu.edu\/viewdoc\/download?doi=10.1.1.667.4579rep=rep1type=pdf."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1002\/asi.20961","article-title":"Computational methods in authorship attribution","volume":"60","author":"Koppel","year":"2009","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"ref_60","unstructured":"Noreen, E.W. (1989). Computer Intensive Methods for Testing Hypotheses: An introduction, Wiley-Interscience."},{"key":"ref_61","unstructured":"Largeron, C., Frery, J., and Juganaru Mathieu, M. (2014, January 15\u201318). UJM at CLEF in author verification based on optimized classification trees. Notebook for PAN at CLEF 2014. In Proceedings of the Conference and Labs of the Evaluation Forum Information Access Evaluation meets Multilinguality, Multimodality, and Interaction, Sheffield, UK."},{"key":"ref_62","unstructured":"Modaresi, P., and Gross, P. (2014, January 15\u201318). A language independent author verifier using fuzzy C-Means clustering. Proceedings of the Working Notes for CLEF 2014 Conference, Sheffield, UK."},{"key":"ref_63","unstructured":"Raghavan, S., Kovashka, A., and Mooney, R. (2010, January 11\u201316). Authorship attribution using probabilistic context-free grammars. 
Proceedings of the ACL 2010 Conference Short Papers, Uppsala, Sweden."},{"key":"ref_64","unstructured":"Stamatatos, E., Daelemans, W., Verhoeven, B., Juola, P., L\u00f3pez-L\u00f3pez, A., Potthast, M., and Stein, B. (2015, January 8\u201311). Overview of the author identification task at PAN 2015. Proceedings of the 2015 Conference and Labs of the Evaluation Forum, Toulouse, France."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/16\/9\/1374\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:29:26Z","timestamp":1760210966000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/16\/9\/1374"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,8,29]]},"references-count":64,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2016,9]]}},"alternative-id":["s16091374"],"URL":"https:\/\/doi.org\/10.3390\/s16091374","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,8,29]]}}}