{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T08:06:55Z","timestamp":1770019615149,"version":"3.49.0"},"reference-count":45,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2023,8,27]],"date-time":"2023-08-27T00:00:00Z","timestamp":1693094400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>This study proposes a hybrid gene selection method to identify and predict key genes in Arabidopsis associated with various stresses (including salt, heat, cold, high-light, and flagellin), aiming to enhance crop tolerance. An open-source microarray dataset (GSE41935) comprising 207 samples and 30,380 genes was analyzed using several machine learning tools including the synthetic minority oversampling technique (SMOTE), information gain (IG), ReliefF, and least absolute shrinkage and selection operator (LASSO), along with various classifiers (BayesNet, logistic, multilayer perceptron, sequential minimal optimization (SMO), and random forest). We identified 439 differentially expressed genes (DEGs), of which only three were down-regulated (AT3G20810, AT1G31680, and AT1G30250). The performance of the top 20 genes selected by IG and ReliefF was evaluated using the classifiers mentioned above to classify stressed versus non-stressed samples. The random forest algorithm outperformed other algorithms with an accuracy of 97.91% and 98.51% for IG and ReliefF, respectively. Additionally, 42 genes were identified from all 30,380 genes using LASSO regression. The top 20 genes for each feature selection were analyzed to determine three common genes (AT5G44050, AT2G47180, and AT1G70700), which formed a three-gene signature. The efficiency of these three genes was evaluated using random forest and XGBoost algorithms. Further validation was performed using an independent RNA_seq dataset and random forest. These gene signatures can be exploited in plant breeding to improve stress tolerance in a variety of crops.<\/jats:p>","DOI":"10.3390\/a16090407","type":"journal-article","created":{"date-parts":[[2023,8,28]],"date-time":"2023-08-28T05:46:47Z","timestamp":1693201607000},"page":"407","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["A Novel Machine-Learning Approach to Predict Stress-Responsive Genes in Arabidopsis"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0940-8486","authenticated-orcid":false,"given":"Leyla","family":"Nazari","sequence":"first","affiliation":[{"name":"Crop and Horticultural Science Research Department, Fars Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education and Extension Organization (AREEO), Shiraz 71558-63511, Iran"}]},{"given":"Vida","family":"Ghotbi","sequence":"additional","affiliation":[{"name":"Agricultural Research, Education and Extension Organization (AREEO), Seed and Plant Improvement Institute, Karaj 31359-33151, Iran"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4550-7572","authenticated-orcid":false,"given":"Mohammad","family":"Nadimi","sequence":"additional","affiliation":[{"name":"Department of Biosystems Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1665-3626","authenticated-orcid":false,"given":"Jitendra","family":"Paliwal","sequence":"additional","affiliation":[{"name":"Department of Biosystems Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada"}]}],"member":"1968","published-online":{"date-parts":[[2023,8,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1093\/jxb\/eru489","article-title":"Multidimensional approaches for studying plant defence against insects: From ecology to omics and synthetic biology","volume":"66","author":"Barah","year":"2015","journal-title":"J. Exp. Bot."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Mosa, K.A., Ismail, A., and Helmy, M. (2017). Plant Stress Tolerance: An Integrated Omics Approach, Springer International Publishing.","DOI":"10.1007\/978-3-319-59379-1"},{"key":"ref_3","unstructured":"Tran, Q.N., and Arabnia, H. (2015). Emerging Trends in Computational Biology, Bioinformatics, and Systems Biology, Morgan Kaufmann."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1111\/nph.12797","article-title":"Abiotic and biotic stress combinations","volume":"203","author":"Suzuki","year":"2014","journal-title":"New Phytol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1002\/dvg.1020070402","article-title":"Changes in plant gene expression during stress","volume":"7","author":"Matters","year":"1986","journal-title":"Dev. Genet."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"570","DOI":"10.1016\/j.tig.2003.08.006","article-title":"Comparison and meta-analysis of microarray data: From the bench to the computer desk","volume":"19","author":"Moreau","year":"2003","journal-title":"Trends Genet."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3147","DOI":"10.1093\/nar\/gkv1463","article-title":"Transcriptional regulatory networks in Arabidopsis thaliana during single and combined stresses","volume":"44","author":"Barah","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1111\/tpj.13167","article-title":"Transcriptome dynamics of Arabidopsis during sequential biotic and abiotic stresses","volume":"86","author":"Coolen","year":"2016","journal-title":"Plant J."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1783","DOI":"10.1104\/pp.112.210773","article-title":"Transcriptome Responses to Combinations of Stresses in Arabidopsis","volume":"161","author":"Rasmussen","year":"2013","journal-title":"Plant Physiol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1055\/s-0040-1714414","article-title":"Advances in Transcriptomics in the Response to Stress in Plants","volume":"07","author":"Wang","year":"2020","journal-title":"Glob. Med. Genet."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Mallik, S., and Zhao, Z. (2018). Identification of gene signatures from RNA-seq data using Pareto-optimal cluster algorithm. BMC Syst. Biol., 12.","DOI":"10.1186\/s12918-018-0650-2"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Khalid, S., Khalil, T., and Nasreen, S. (2014, January 27\u201329). A survey of feature selection and feature extraction techniques in machine learning. Proceedings of the Science and Information Conference (SAI), London, UK.","DOI":"10.1109\/SAI.2014.6918213"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"603808","DOI":"10.3389\/fgene.2020.603808","article-title":"Machine Learning Based Computational Gene Selection Models: A Survey, Performance Evaluation, Open Issues, and Future Research Directions","volume":"11","author":"Mahendran","year":"2020","journal-title":"Front. Genet."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"e00154","DOI":"10.1002\/pld3.154","article-title":"Network-based feature selection reveals substructures of gene modules responding to salt stress in rice","volume":"3","author":"Du","year":"2019","journal-title":"Plant Direct"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1186\/s40537-021-00472-4","article-title":"Determining threshold value on information gain feature selection to increase speed and prediction accuracy of random forest","volume":"8","author":"Prasetiyowati","year":"2021","journal-title":"J. Big Data"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1016\/j.jbi.2018.07.014","article-title":"Relief-based feature selection: Introduction and review","volume":"85","author":"Urbanowicz","year":"2018","journal-title":"J. Biomed. Inform."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1105\/tpc.15.00910","article-title":"Time-Series Transcriptomics Reveals That AGAMOUS-LIKE22 Affects Primary Metabolism and Developmental Processes in Drought-Stressed Arabidopsis","volume":"28","author":"Bechtold","year":"2016","journal-title":"Plant Cell"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"893","DOI":"10.1105\/tpc.112.096180","article-title":"Physiological Genomics of Response to Soil Drying in Diverse Arabidopsis Accessions","volume":"24","author":"Marais","year":"2012","journal-title":"Plant Cell"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1158352","DOI":"10.3389\/fgene.2023.1158352","article-title":"Gene filtering strategies for machine learning guided biomarker discovery using neonatal sepsis RNA-seq data","volume":"14","author":"Parkinson","year":"2023","journal-title":"Front. Genet."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Blagus, R., and Lusa, L. (2013). SMOTE for high-dimensional class-imbalanced data. BMC Bioinform., 14.","DOI":"10.1186\/1471-2105-14-106"},{"key":"ref_21","unstructured":"Bouckaert, R.R., Frank, E., Hall, M., Kirkby, R., Reutemann, P., Seewald, A., and Scuse, D. (2016). WEKA Manual for Version 3-9-1, University of Waikato."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1456","DOI":"10.1214\/13-EJS815","article-title":"The lasso problem and uniqueness","volume":"7","author":"Tibshirani","year":"2013","journal-title":"Electron. J. Stat."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1016\/j.neucom.2016.08.089","article-title":"Gene selection using information gain and improved simplified swarm optimization","volume":"218","author":"Lai","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_25","unstructured":"McDonald, C. (1998). Computer Science \u201998 Proceedings of the 21st Australasian Computer Science Conference ACSC\u201998, Perth, Australia, 4\u20136 February 1998, Springer."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1023\/A:1025667309714","article-title":"Theoretical and Empirical Analysis of ReliefF and RReliefF","volume":"53","author":"Kononenko","year":"2003","journal-title":"Mach. Learn."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"3940","DOI":"10.1093\/bioinformatics\/bti623","article-title":"ROCR: Visualizing classifier performance in R","volume":"21","author":"Sing","year":"2005","journal-title":"Bioinformatics"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning, Springer.","DOI":"10.1007\/978-1-4614-7138-7"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v036.i11","article-title":"Feature Selection with the Boruta Package","volume":"36","author":"Kursa","year":"2010","journal-title":"J. Stat. Softw."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2164","DOI":"10.1093\/plcell\/koab113","article-title":"Time of the day prioritizes the pool of translating mRNAs in response to heat stress","volume":"33","author":"Bonnot","year":"2021","journal-title":"Plant Cell"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1007\/s11749-016-0481-7","article-title":"A random forest guided tour","volume":"25","author":"Biau","year":"2016","journal-title":"TEST"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"256","DOI":"10.3389\/fgene.2019.00256","article-title":"A Machine Learning Approach for Identifying Gene Biomarkers Guiding the Treatment of Breast Cancer","volume":"10","author":"Tabl","year":"2019","journal-title":"Front. Genet."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1186\/s12967-022-03369-9","article-title":"XGBoost-based and tumor-immune characterized gene signature for the prediction of metastatic status in breast cancer","volume":"20","author":"Li","year":"2022","journal-title":"J. Transl. Med."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"3964","DOI":"10.1038\/srep03964","article-title":"Expression of OsMATE1 and OsMATE2 alters development, stress responses and pathogen susceptibility in Arabidopsis","volume":"4","author":"Tiwari","year":"2014","journal-title":"Sci. Rep."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Magwanga, R.O., Lu, P., Kirungu, J.N., Lu, H., Wang, X., Cai, X., Zhou, Z., Zhang, Z., Salih, H., and Wang, K. (2018). Characterization of the late embryogenesis abundant (LEA) proteins family and their role in drought stress tolerance in upland cotton. BMC Genet., 19.","DOI":"10.1186\/s12863-017-0596-1"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1007\/s13580-021-00413-3","article-title":"Genome-wide identification and comparative analysis of MATE gene family in Cucurbitaceae species and their regulatory role in melon (Cucumis melo) under salt stress","volume":"63","author":"Shah","year":"2022","journal-title":"Hortic. Environ. Biotechnol."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1046\/j.0960-7412.2001.01227.x","article-title":"Important roles of drought- and cold-inducible genes for galactinol synthase in stress tolerance in Arabidopsis thaliana","volume":"29","author":"Taji","year":"2002","journal-title":"Plant J."},{"key":"ref_38","unstructured":"Janse van Rensburg, H.C. (2016). The Arabidopsis GolS1 Promotor as a Potential Biosensor for Heat Stress and Fungal Infection?. [Master\u2019s Thesis, Stellenbosch University]."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1070","DOI":"10.1071\/FP21334","article-title":"Harboured cation\/proton antiporters modulate stress response to integrated heat and salt via up-regulating KIN1 and GOLS1 in double transgenic Arabidopsis","volume":"49","author":"Kahraman","year":"2022","journal-title":"Funct. Plant Biol."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Chini, A., Ben-Romdhane, W., Hassairi, A., and Aboul-Soud, M.A.M. (2017). Identification of TIFY\/JAZ family genes in Solanum lycopersicum and their regulation in response to abiotic stresses. PLoS ONE, 12.","DOI":"10.1371\/journal.pone.0177381"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Ebel, C., BenFeki, A., Hanin, M., Solano, R., and Chini, A. (2018). Characterization of wheat (Triticum aestivum) TIFY family and role of Triticum Durum TdTIFY11a in salt stress tolerance. PLoS ONE, 13.","DOI":"10.1371\/journal.pone.0200566"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1007\/s11103-009-9524-8","article-title":"Identification and expression profiling analysis of TIFY family genes involved in stress and phytohormone responses in rice","volume":"71","author":"Ye","year":"2009","journal-title":"Plant Mol. Biol."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"100043","DOI":"10.1016\/j.meafoo.2022.100043","article-title":"A unified heuristic approach to simultaneously detect fusarium and ergot damage in wheat","volume":"7","author":"Erkinbaev","year":"2022","journal-title":"Meas. Food"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"2495","DOI":"10.1111\/1541-4337.13150","article-title":"Enhancing traceability of wheat quality through the supply chain","volume":"22","author":"Nadimi","year":"2023","journal-title":"Compr. Rev. Food Sci. Food Saf."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Nadimi, M., Loewen, G., Bhowmik, P., and Paliwal, J. (2022). Effect of Laser Biostimulation on Germination of Sub-Optimally Stored Flaxseeds (Linum usitatissimum). Sustainability, 14.","DOI":"10.3390\/su141912183"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/9\/407\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:39:56Z","timestamp":1760128796000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/9\/407"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,27]]},"references-count":45,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2023,9]]}},"alternative-id":["a16090407"],"URL":"https:\/\/doi.org\/10.3390\/a16090407","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,27]]}}}