{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T07:38:59Z","timestamp":1776929939825,"version":"3.51.2"},"reference-count":49,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2021,5,21]],"date-time":"2021-05-21T00:00:00Z","timestamp":1621555200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NO Grants 2014-2021","award":["Project contract no. 26\/2020."],"award-info":[{"award-number":["Project contract no. 26\/2020."]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Proteins are essential molecules, that must correctly perform their roles for the good health of living organisms. The majority of proteins operate in complexes and the way they interact has pivotal influence on the proper functioning of such organisms. In this study we address the problem of protein\u2013protein interaction and we propose and investigate a method based on the use of an ensemble of autoencoders. Our approach, entitled AutoPPI, adopts a strategy based on two autoencoders, one for each type of interactions (positive and negative) and we advance three types of neural network architectures for the autoencoders. Experiments were performed on several data sets comprising proteins from four different species. The results indicate good performances of our proposed model, with accuracy and AUC values of over 0.97 in all cases. The best performing model relies on a Siamese architecture in both the encoder and the decoder, which advantageously captures common features in protein pairs. Comparisons with other machine learning techniques applied for the same problem prove that AutoPPI outperforms most of its contenders, for the considered data sets.<\/jats:p>","DOI":"10.3390\/e23060643","type":"journal-article","created":{"date-parts":[[2021,5,21]],"date-time":"2021-05-21T13:15:15Z","timestamp":1621602915000},"page":"643","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["AutoPPI: An Ensemble of Deep Autoencoders for Protein\u2013Protein Interaction Prediction"],"prefix":"10.3390","volume":"23","author":[{"given":"Gabriela","family":"Czibula","sequence":"first","affiliation":[{"name":"Department of Computer Science, Babe\u015f-Bolyai University, 400084 Cluj-Napoca, Romania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexandra-Ioana","family":"Albu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Babe\u015f-Bolyai University, 400084 Cluj-Napoca, Romania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maria Iuliana","family":"Bocicor","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Babe\u015f-Bolyai University, 400084 Cluj-Napoca, Romania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Camelia","family":"Chira","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Babe\u015f-Bolyai University, 400084 Cluj-Napoca, Romania"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,5,21]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"147648","DOI":"10.1155\/2014\/147648","article-title":"Protein-protein interaction detection: Methods and analysis","volume":"2014","author":"Rao","year":"2014","journal-title":"Int. J. Proteom."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"269","DOI":"10.2217\/bmm.13.101","article-title":"Mass spectrometry in cancer biomarker research: A case for immunodepletion of abundant blood-derived proteins from clinical tissue specimens","volume":"8","author":"Prieto","year":"2014","journal-title":"Biomark. Med."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1038\/nature750","article-title":"Comparative assessment of large-scale data sets of protein\u2013protein interactions","volume":"417","author":"Krause","year":"2002","journal-title":"Nature"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"556","DOI":"10.1038\/nature11503","article-title":"Structure-based prediction of protein\u2013protein interactions on a genome-wide scale","volume":"490","author":"Zhang","year":"2012","journal-title":"Nature"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lee, S.A., Chan, C.h., Tsai, C.H., Lai, J.M., Wang, F.S., Kao, C.Y., and Huang, C.Y.F. (2008). Ortholog-based protein-protein interaction prediction and its application to inter-species interactions. BMC Bioinform., 9.","DOI":"10.1186\/1471-2105-9-S12-S11"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1210","DOI":"10.1016\/j.jmb.2013.01.014","article-title":"Understanding protein\u2013protein interactions using local structural features","volume":"425","author":"Bonet","year":"2013","journal-title":"J. Mol. Biol."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Sun, T., Zhou, B., Lai, L., and Pei, J. (2017). Sequence-based prediction of protein protein interaction using a deep-learning algorithm. BMC Bioinform., 18.","DOI":"10.1186\/s12859-017-1700-2"},{"key":"ref_8","first-page":"129","article-title":"Improvement of the mirrortree method by extracting evolutionary information","volume":"21","author":"Sato","year":"2011","journal-title":"Insequence Genome Anal. Method Appl."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1126\/science.285.5428.751","article-title":"Detecting protein function and protein-protein interactions from genome sequences","volume":"285","author":"Marcotte","year":"1999","journal-title":"Science"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Pesquita, C., Faria, D., Falcao, A.O., Lord, P., and Couto, F.M. (2009). Semantic similarity in biomedical ontologies. PLoS Comput. Biol., 5.","DOI":"10.1371\/journal.pcbi.1000443"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1126\/science.1087361","article-title":"A Bayesian networks approach for predicting protein-protein interactions from genomic data","volume":"302","author":"Jansen","year":"2003","journal-title":"Science"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"3025","DOI":"10.1093\/nar\/gkn159","article-title":"Using support vector machine combined with auto covariance to predict protein\u2013protein interactions from protein sequences","volume":"36","author":"Guo","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"4394","DOI":"10.1093\/bioinformatics\/bti721","article-title":"Prediction of protein\u2013protein interactions using random decision forest framework","volume":"21","author":"Chen","year":"2005","journal-title":"Bioinformatics"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Browne, F., Wang, H., Zheng, H., and Azuaje, F. (2007, January 14\u201317). Supervised statistical and machine learning approaches to inferring pairwise and module-based protein interaction networks. Proceedings of the 2007 IEEE 7th International Symposium on BioInformatics and BioEngineering, Boston, MA, USA.","DOI":"10.1109\/BIBE.2007.4375748"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Chen, K.H., Wang, T.F., and Hu, Y.J. (2019). Protein-protein interaction prediction using a hybrid feature representation and a stacked generalization scheme. BMC Bioinform., 20.","DOI":"10.1186\/s12859-019-2907-1"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Bagheri, H., Dyer, R., Severin, A., and Rajan, H. (2021, May 20). Comprehensive Analysis of Non Redundant Protein Database. Res. Sq., Available online: https:\/\/www.researchsquare.com\/article\/rs-54568\/v1.","DOI":"10.21203\/rs.3.rs-54568\/v1"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"11079","DOI":"10.1073\/pnas.0905029106","article-title":"Nature of the protein universe","volume":"106","author":"Levitt","year":"2009","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_18","unstructured":"(2021, March 18). PDB Statistics: Overall Growth of Released Structures Per Year. Available online: https:\/\/www.rcsb.org\/stats\/growth\/growth-released-structures."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"i305","DOI":"10.1093\/bioinformatics\/btz328","article-title":"Multifaceted protein\u2013protein interaction prediction based on Siamese residual RCNN","volume":"35","author":"Chen","year":"2019","journal-title":"Bioinformatics"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"4337","DOI":"10.1073\/pnas.0607879104","article-title":"Predicting protein\u2013protein interactions based only on sequences information","volume":"104","author":"Shen","year":"2007","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_21","first-page":"839","article-title":"Protein Interaction Network Reconstruction Through Ensemble Deep Learning With Attention Mechanism","volume":"8","author":"Li","year":"2020","journal-title":"Front. Bioeng. Biotechnol."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1336","DOI":"10.1039\/C7MB00188F","article-title":"Predicting protein\u2013protein interactions from protein sequences by a stacked sparse autoencoder deep neural network","volume":"13","author":"Wang","year":"2017","journal-title":"Mol. Biosyst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"4216813","DOI":"10.1155\/2018\/4216813","article-title":"Predicting protein interactions using a deep learning method-stacked sparse autoencoder combined with a probabilistic classification vector machine","volume":"2018","author":"Wang","year":"2018","journal-title":"Complexity"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"103964","DOI":"10.1016\/j.compbiomed.2020.103964","article-title":"AE-LGBM: Sequence-based novel approach to detect interacting protein pairs via ensemble of autoencoder and LightGBM","volume":"125","author":"Sharma","year":"2020","journal-title":"Comput. Biol. Med."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Yang, F., Fan, K., Song, D., and Lin, H. (2020). Graph-based prediction of Protein-protein interactions with attributed signed graph embedding. BMC Bioinform., 21.","DOI":"10.1186\/s12859-020-03646-8"},{"key":"ref_26","first-page":"3563","article-title":"What regularized auto-encoders learn from the data-generating distribution","volume":"15","author":"Alain","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_27","unstructured":"Koch, G., Zemel, R., and Salakhutdinov, R. (2021, May 21). Siamese Neural Networks for One-Shot Image Recognition. Available online: https:\/\/www.cs.cmu.edu\/~rsalakhu\/papers\/oneshot1.pdf."},{"key":"ref_28","first-page":"986","article-title":"Learning semantic similarity in a continuous space","volume":"Volume 31","author":"Deudon","year":"2018","journal-title":"Advances in Neural Information Processing Systems"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Utkin, L.V., Zaborovsky, V.S., Lukashin, A.A., Popov, S.G., and Podolskaja, A.V. (2017, January 20\u201322). A siamese autoencoder preserving distances for anomaly detection in multi-robot systems. Proceedings of the 2017 International Conference on Control, Artificial Intelligence, Robotics & Optimization (ICCAIRO), Prague, Czech Republic.","DOI":"10.1109\/ICCAIRO.2017.17"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-14-S8-S10","article-title":"Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis","volume":"Volume 14","author":"You","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2499","DOI":"10.1093\/bioinformatics\/bty140","article-title":"iFeature: A python package and web server for features extraction and selection from protein and peptide sequences","volume":"34","author":"Chen","year":"2018","journal-title":"Bioinformatics"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1016\/j.omtn.2020.08.025","article-title":"Conjoint Feature Representation of GO and Protein Sequence for PPI Prediction Based on an Inception RNN Attention Network","volume":"22","author":"Zhao","year":"2020","journal-title":"Mol. Ther. Nucleic Acids"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Li, H., Gong, X.J., Yu, H., and Zhou, C. (2018). Deep neural network based predictions of protein interactions using primary sequences. Molecules, 23.","DOI":"10.3390\/molecules23081923"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"i802","DOI":"10.1093\/bioinformatics\/bty573","article-title":"Predicting protein\u2013protein interactions through sequence-based deep learning","volume":"34","author":"Hashemifar","year":"2018","journal-title":"Bioinformatics"},{"key":"ref_35","unstructured":"Klambauer, G., Unterthiner, T., Mayr, A., and Hochreiter, S. (2017). Self-normalizing neural networks. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Abadi, M. (2016, January 18\u201324). TensorFlow: Learning functions at scale. Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, Nara, Japan.","DOI":"10.1145\/2951913.2976746"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Gu, Q., Zhu, L., and Cai, Z. (2009, January 23\u201325). Evaluation Measures of the Classification Performance of Imbalanced Data Sets. Proceedings of the International Symposium on Intelligence Computation and Applications (ISICA), Huangshi, China.","DOI":"10.1007\/978-3-642-04962-0_53"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1214\/ss\/1009213286","article-title":"Interval Estimation for a proportion","volume":"16","author":"Brown","year":"2001","journal-title":"Stat. Sci."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"4992","DOI":"10.1021\/pr100618t","article-title":"Large-Scale prediction of human protein- protein interactions from amino acid sequence based on latent topic features","volume":"9","author":"Pan","year":"2010","journal-title":"J. Proteome Res."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1756-0500-3-145","article-title":"PRED_PPI: A server for predicting protein-protein interactions based on sequence data with probability assignment","volume":"3","author":"Guo","year":"2010","journal-title":"Bmc Res. Notes"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"887","DOI":"10.1007\/s00726-012-1416-6","article-title":"An empirical study on the matrix-based protein representations and their combination with sequence-based approaches","volume":"44","author":"Nanni","year":"2013","journal-title":"Amino Acids"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/j.neucom.2014.05.072","article-title":"A MapReduce based parallel SVM for large-scale predicting protein\u2013protein interactions","volume":"145","author":"You","year":"2014","journal-title":"Neurocomputing"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/j.jtbi.2011.05.023","article-title":"Adaptive compressive learning for prediction of protein\u2013protein interactions from primary sequence","volume":"283","author":"Zhang","year":"2011","journal-title":"J. Theor. Biol."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"598129","DOI":"10.1155\/2014\/598129","article-title":"Large-scale protein-protein interactions detection by integrating big biosensing data with computational model","volume":"2014","author":"You","year":"2014","journal-title":"Biomed Res. Int."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1142\/S0218339019500013","article-title":"DNN-PPI: A Large-Scale Prediction of Protein\u2013Protein Interactions Based on Deep Neural Networks","volume":"27","author":"Gui","year":"2019","journal-title":"J. Biol. Syst."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"2052012","DOI":"10.1142\/S0218001420520126","article-title":"Using deep neural networks to improve the performance of protein-protein interactions prediction","volume":"34","author":"Gui","year":"2020","journal-title":"Int. J. Pattern Recognit. Artif. Intell."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.mbs.2019.04.002","article-title":"A novel conjoint triad auto covariance (CTAC) coding method for predicting protein-protein interaction based on amino acid sequence","volume":"313","author":"Wang","year":"2019","journal-title":"Math. Biosci."},{"key":"ref_48","unstructured":"Siegel, S., and Castellan, N. (1988). Nonparametric Statistics for the Behavioral Sciences, McGraw-Hill, Inc.. [2nd ed.]."},{"key":"ref_49","unstructured":"(2021, May 20). Social Science Statistics. Available online: http:\/\/www.socscistatistics.com\/tests\/."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/6\/643\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:05:29Z","timestamp":1760162729000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/6\/643"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,5,21]]},"references-count":49,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2021,6]]}},"alternative-id":["e23060643"],"URL":"https:\/\/doi.org\/10.3390\/e23060643","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,5,21]]}}}