{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:05:20Z","timestamp":1760231120603,"version":"build-2065373602"},"reference-count":34,"publisher":"MDPI AG","issue":"17","license":[{"start":{"date-parts":[[2022,8,31]],"date-time":"2022-08-31T00:00:00Z","timestamp":1661904000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the Korea Environment Industry & Technology Institute","award":["2021003380003"],"award-info":[{"award-number":["2021003380003"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Several pathogens that spread through the air are highly contagious, and related infectious diseases are more easily transmitted through airborne transmission under indoor conditions, as observed during the COVID-19 pandemic. Indoor air contaminated by microorganisms, including viruses, bacteria, and fungi, or by derived pathogenic substances, can endanger human health. Thus, identifying and analyzing the potential pathogens residing in the air are crucial to preventing disease and maintaining indoor air quality. Here, we applied deep learning technology to analyze and predict the toxicity of bacteria in indoor air. We trained the ProtBert model on toxic bacterial and virulence factor proteins and applied them to predict the potential toxicity of some bacterial species by analyzing their protein sequences. The results reflect the results of the in vitro analysis of their toxicity in human cells. 
The in silico-based simulation and the obtained results demonstrated that it is plausible to find possible toxic sequences in unknown protein sequences.<\/jats:p>","DOI":"10.3390\/s22176557","type":"journal-article","created":{"date-parts":[[2022,8,31]],"date-time":"2022-08-31T02:09:36Z","timestamp":1661911776000},"page":"6557","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Reliability of the In Silico Prediction Approach to In Vitro Evaluation of Bacterial Toxicity"],"prefix":"10.3390","volume":"22","author":[{"given":"Sung-Yoon","family":"Ahn","sequence":"first","affiliation":[{"name":"Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Korea"}]},{"given":"Mira","family":"Kim","sequence":"additional","affiliation":[{"name":"Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea"}]},{"given":"Ji-Eun","family":"Bae","sequence":"additional","affiliation":[{"name":"Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5284-9709","authenticated-orcid":false,"given":"Iel-Soo","family":"Bang","sequence":"additional","affiliation":[{"name":"Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8117-6566","authenticated-orcid":false,"given":"Sang-Woong","family":"Lee","sequence":"additional","affiliation":[{"name":"Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1038\/sj.jea.7500244","article-title":"It\u2019s about time: A comparison of Canadian and American time\u2013activity 
patterns","volume":"12","author":"Leech","year":"2002","journal-title":"J. Expo. Sci. Environ. Epidemiol."},{"key":"ref_2","unstructured":"WHO (2020, January 28). Household Air Pollution and Health, Available online: https:\/\/www.who.int\/en\/news-room\/fact-sheets\/detail\/household-air-pollution-and-health."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1016\/j.physbeh.2004.02.007","article-title":"Energy balance and reproduction","volume":"81","author":"Schneider","year":"2004","journal-title":"Physiol. Behav."},{"key":"ref_4","first-page":"1","article-title":"Toxins from bacteria","volume":"2","author":"Henkel","year":"2010","journal-title":"Mol. Toxicol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.3109\/13693786.2012.698025","article-title":"Fungal hemolysins","volume":"51","author":"Nayak","year":"2013","journal-title":"Med. Mycol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1903","DOI":"10.1890\/06-1052.1","article-title":"Globalization of human infectious disease","volume":"88","author":"Smith","year":"2007","journal-title":"Ecology"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Farzanegan, M.R., Feizi, M., and Gholipour, H.F. (2021). Globalization and the outbreak of COVID-19: An empirical analysis. J. Risk Financ. Manag., 14.","DOI":"10.3390\/jrfm14030105"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1184","DOI":"10.1109\/TCBB.2018.2819660","article-title":"High-order convolutional neural network architecture for predicting DNA-protein binding sites","volume":"16","author":"Zhang","year":"2018","journal-title":"IEEE ACM Trans. Comput. Biol. 
Bioinform."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Gupta, S., Kapoor, P., Chaudhary, K., Gautam, A., Kumar, R., Open Source Drug Discovery Consortium, and Raghava, G.P. (2007). In Silico approach for predicting toxicity of peptides and proteins. PLoS ONE, 8.","DOI":"10.1371\/journal.pone.0073957"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"W363","DOI":"10.1093\/nar\/gkp299","article-title":"ClanTox: A classifier of short animal toxins","volume":"37","author":"Naamati","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"e7200","DOI":"10.7717\/peerj.7200","article-title":"TOXIFY: A deep learning approach to classify animal venom proteins","volume":"7","author":"Cole","year":"2019","journal-title":"PeerJ"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5159","DOI":"10.1093\/bioinformatics\/btaa656","article-title":"ToxDL: Deep learning using primary structure and domain embeddings for assessing protein toxicity","volume":"36","author":"Pan","year":"2021","journal-title":"Bioinformatics"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1514","DOI":"10.1093\/bioinformatics\/btac006","article-title":"ToxIBTL: Prediction of peptide toxicity based on information bottleneck and transfer learning","volume":"38","author":"Wei","year":"2022","journal-title":"Bioinformatics"},{"key":"ref_15","unstructured":"Ulrike, V.L., Isabelle, G., Samy, B., Hanna, W., and Rob, F. (2017). Attention is all you need. Advances in Neural Information Processing Systems, Curran Associates Inc."},{"key":"ref_16","unstructured":"Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). 
Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_17","unstructured":"Bao, H., Dong, L., and Wei, F. (2021). Beit: Bert pre-training of image transformers. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Elnaggar, A., Heinzinger, M., Dallago, C., Rihawi, G., Wang, Y., Jones, L., Gibbs, T., Feher, T., Angerer, C., and Steinegger, M. (2020). ProtTrans: Towards cracking the language of Life\u2019s code through self-supervised deep learning and high performance computing. arXiv.","DOI":"10.1101\/2020.07.12.199554"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"551","DOI":"10.1016\/j.toxicon.2012.03.010","article-title":"The UniProtKB\/Swiss-Prot Tox-Prot program: A central hub of integrated venom protein data","volume":"60","author":"Jungo","year":"2012","journal-title":"Toxicon"},{"key":"ref_20","first-page":"405","article-title":"BTXpred: Prediction of bacterial toxins","volume":"7","author":"Saha","year":"2007","journal-title":"Silico Biol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"D912","DOI":"10.1093\/nar\/gkab1107","article-title":"VFDB 2022: A general classification scheme for bacterial virulence factors","volume":"50","author":"Liu","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Sharma, N., Naorem, L.D., Jain, S., and Raghava, G.P. (2022). ToxinPred2: An improved method for predicting toxicity of proteins. Brief. 
Bioinform., bbac174.","DOI":"10.1093\/bib\/bbac174"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1093\/nar\/gkg095","article-title":"The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003","volume":"31","author":"Boeckmann","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1107","DOI":"10.1093\/jnci\/82.13.1107","article-title":"New Colorimetric Cytotoxicity Assay for Anticancer-Drug Screening","volume":"82","author":"Skehan","year":"1990","journal-title":"J. Natl. Cancer Inst."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1038\/227168a0","article-title":"Characteristics of a human diploid cell designated MRC-5","volume":"227","author":"Jacobs","year":"1970","journal-title":"Nature"},{"key":"ref_26","first-page":"264","article-title":"Tissue culture studies of the proliferative capacity of cervical carcinoma and normal epithelium","volume":"12","author":"Gey","year":"1952","journal-title":"Cancer Res."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1038\/emm.2005.48","article-title":"Characterization of newly established oral cancer cell lines derived from six squamous cell carcinoma and two mucoepidermoid carcinoma cells","volume":"37","author":"Lee","year":"2005","journal-title":"Exp. Mol. Med."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"D36","DOI":"10.1093\/nar\/gks1195","article-title":"GenBank","volume":"41","author":"Benson","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., and Funtowicz, M. (2020, January 16\u201318). Transformers: State-of-the-art natural language processing. 
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"ref_30","unstructured":"(2022, April 10). Fine-Tune and Deploy the ProtBERT Model for Protein Classification Using Amazon SageMaker. Available online: https:\/\/aws.amazon.com\/blogs\/machine-learning\/fine-tune-and-deploy-the-protbert-model-for-protein-classification-using-amazon-sagemaker\/."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: A new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1236","DOI":"10.1093\/bioinformatics\/btu031","article-title":"InterProScan 5: Genome-scale protein function classification","volume":"30","author":"Jones","year":"2014","journal-title":"Bioinformatics"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"W200","DOI":"10.1093\/nar\/gky448","article-title":"HMMER web server: 2018 update","volume":"46","author":"Potter","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Tiessen, A., P\u00e9rez-Rodr\u00edguez, P., and Delaye-Arredondo, L.J. (2012). Mathematical modeling and comparison of protein size distribution in different plant, animal, fungal and microbial species reveals a negative correlation between protein size and protein number, thus providing insight into the evolution of proteomes. BMC Res. 
Notes, 5.","DOI":"10.1186\/1756-0500-5-85"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/17\/6557\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:20:42Z","timestamp":1760142042000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/17\/6557"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,31]]},"references-count":34,"journal-issue":{"issue":"17","published-online":{"date-parts":[[2022,9]]}},"alternative-id":["s22176557"],"URL":"https:\/\/doi.org\/10.3390\/s22176557","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2022,8,31]]}}}