{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T18:05:22Z","timestamp":1775325922973,"version":"3.50.1"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,1,22]],"date-time":"2021-01-22T00:00:00Z","timestamp":1611273600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,1,22]],"date-time":"2021-01-22T00:00:00Z","timestamp":1611273600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100007847","name":"Natural Science Foundation of Jilin Province","doi-asserted-by":"crossref","award":["11571173"],"award-info":[{"award-number":["11571173"]}],"id":[{"id":"10.13039\/100007847","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Currently, large-scale gene expression profiling has been successfully applied to the discovery of functional connections among diseases, genetic perturbation, and drug action. To address the cost of an ever-expanding gene expression profile, a new, low-cost, high-throughput reduced representation expression profiling method called L1000 was proposed, with which one million profiles were produced. Although a set of\u2009~\u20091000 carefully chosen landmark genes that can capture\u2009~\u200980% of information from the whole genome has been identified for use in L1000, the robustness of using these landmark genes to infer target genes is not satisfactory. 
Therefore, more efficient computational methods are still needed to deeply mine the influential genes in the genome.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Here, we propose a computational framework based on deep learning to mine a subset of genes that can cover more genomic information. Specifically, an AutoEncoder framework is first constructed to learn the non-linear relationships between genes, and then DeepLIFT is applied to calculate gene importance scores. Using this data-driven approach, we derived a new landmark gene set. The results show that our landmark genes predict target genes more accurately and robustly than those of L1000 on two metrics [mean absolute error (MAE) and Pearson correlation coefficient (PCC)]. This indicates that the landmark genes detected by our method contain more genomic information.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>We believe that our proposed framework is well suited to the analysis of biological big data to reveal the mysteries of life. 
Furthermore, the landmark genes inferred from this study can be used for the explosive amplification of gene expression profiles to facilitate research into functional connections.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-021-03972-5","type":"journal-article","created":{"date-parts":[[2021,1,22]],"date-time":"2021-01-22T04:09:06Z","timestamp":1611288546000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Mining influential genes based on deep learning"],"prefix":"10.1186","volume":"22","author":[{"given":"Lingpeng","family":"Kong","sequence":"first","affiliation":[]},{"given":"Yuanyuan","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Fengjiao","family":"Xu","sequence":"additional","affiliation":[]},{"given":"Mingmin","family":"Xu","sequence":"additional","affiliation":[]},{"given":"Zutan","family":"Li","sequence":"additional","affiliation":[]},{"given":"Jingya","family":"Fang","sequence":"additional","affiliation":[]},{"given":"Liangyun","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Cong","family":"Pian","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,1,22]]},"reference":[{"issue":"23","key":"3972_CR1","doi-asserted-by":"publisher","first-page":"7285","DOI":"10.1073\/pnas.1507125112","volume":"112","author":"S Darmanis","year":"2015","unstructured":"Darmanis S, Sloan SA, Zhang Y, Enge M, Caneda C, Shuer LM, Hayden Gephart MG, Barres BA, Quake SR. A survey of human brain transcriptome diversity at the single cell level. Proc Natl Acad Sci U S A. 
2015;112(23):7285\u201390.","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"4","key":"3972_CR2","doi-asserted-by":"publisher","first-page":"320","DOI":"10.1038\/ng.3225","volume":"47","author":"A Calon","year":"2015","unstructured":"Calon A, Lonardo E, Berenguer-Llergo A, Espinet E, Hernando-Momblona X, Iglesias M, Sevillano M, Palomo-Ponce S, Tauriello DV, Byrom D, et al. Stromal gene expression defines poor-prognosis subtypes in colorectal cancer. Nat Genet. 2015;47(4):320\u20139.","journal-title":"Nat Genet"},{"issue":"5795","key":"3972_CR3","doi-asserted-by":"publisher","first-page":"1929","DOI":"10.1126\/science.1132939","volume":"313","author":"J Lamb","year":"2006","unstructured":"Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP, Subramanian A, Ross KN, et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science (New York, NY). 2006;313(5795):1929\u201335.","journal-title":"Science (New York, NY)"},{"issue":"1","key":"3972_CR4","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1186\/s13059-016-0970-8","volume":"17","author":"V Ntranos","year":"2016","unstructured":"Ntranos V, Kamath GM, Zhang JM, Pachter L, Tse DN. Fast and accurate single-cell RNA-seq analysis by clustering of transcript-compatibility counts. Genome Biol. 2016;17(1):112.","journal-title":"Genome Biol"},{"issue":"4","key":"3972_CR5","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1016\/j.cels.2016.04.001","volume":"2","author":"G Heimberg","year":"2016","unstructured":"Heimberg G, Bhatnagar R, El-Samad H, Thomson M. Low Dimensionality in Gene Expression Data Enables the Accurate Extraction of Transcriptional Programs from Shallow Sequencing. Cell Syst. 
2016;2(4):239\u201350.","journal-title":"Cell Syst"},{"issue":"2","key":"3972_CR6","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1016\/j.neuron.2016.10.001","volume":"92","author":"S Shah","year":"2016","unstructured":"Shah S, Lubeck E, Zhou W, Cai L. In situ transcription profiling of single cells reveals spatial organization of cells in the mouse hippocampus. Neuron. 2016;92(2):342\u201357.","journal-title":"Neuron"},{"key":"3972_CR7","doi-asserted-by":"crossref","unstructured":"Subramanian A, Narayan R, Corsello SM, Peck DD, Natoli TE, Lu X, Gould J, Davis JF, Tubelli AA, Asiedu JK et al: A Next generation connectivity map: L1000 platform and the first 1,000,000 Profiles. Cell 2017, 171(6):1437\u20131452 e1417.","DOI":"10.1016\/j.cell.2017.10.049"},{"issue":"1","key":"3972_CR8","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1093\/nar\/30.1.207","volume":"30","author":"R Edgar","year":"2002","unstructured":"Edgar R, Domrachev M, Lash AE. Gene expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30(1):207\u201310.","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"3972_CR9","doi-asserted-by":"publisher","first-page":"1832","DOI":"10.1093\/bioinformatics\/btw074","volume":"32","author":"Y Chen","year":"2016","unstructured":"Chen Y, Li Y, Narayan R, Subramanian A, Xie X. Gene expression inference with deep learning. Bioinformatics. 2016;32(12):1832\u20139.","journal-title":"Bioinformatics"},{"issue":"17","key":"3972_CR10","doi-asserted-by":"publisher","first-page":"i603","DOI":"10.1093\/bioinformatics\/bty563","volume":"34","author":"X Wang","year":"2018","unstructured":"Wang X, Ghasedi Dizaji K, Huang H. Conditional generative adversarial network for gene expression inference. Bioinformatics. 
2018;34(17):i603\u201311.","journal-title":"Bioinformatics"},{"issue":"15","key":"3972_CR11","doi-asserted-by":"publisher","first-page":"1811","DOI":"10.1093\/bioinformatics\/btq273","volume":"26","author":"H Brunel","year":"2010","unstructured":"Brunel H, Gallardo-Chacon JJ, Buil A, Vallverdu M, Soria JM, Caminal P, Perera A. MISS: a non-linear methodology based on mutual information for genetic association studies in both population and sib-pairs analysis. Bioinformatics. 2010;26(15):1811\u20138.","journal-title":"Bioinformatics"},{"issue":"5","key":"3972_CR12","first-page":"851","volume":"18","author":"S Min","year":"2017","unstructured":"Min S, Lee B, Yoon S. Deep learning in bioinformatics. Brief Bioinform. 2017;18(5):851\u201369.","journal-title":"Brief Bioinform"},{"issue":"6","key":"3972_CR13","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1145\/3065386","volume":"60","author":"A Krizhevsky","year":"2017","unstructured":"Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017;60(6):84\u201390.","journal-title":"Commun ACM"},{"key":"3972_CR14","first-page":"28","volume-title":"Cho K","author":"J Chorowski","year":"2015","unstructured":"Chorowski J, Bahdanau D, Serdyuk D. Cho K. Bengio Y: Attention-based models for speech recognition. Adv Neur In; 2015. p. 28."},{"key":"3972_CR15","doi-asserted-by":"crossref","unstructured":"Li JW, Luong MT, Jurafsky D: A Hierarchical neural autoencoder for paragraphs and documents. Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, Vol 1 2015, 1:1106\u20131115.","DOI":"10.3115\/v1\/P15-1107"},{"key":"3972_CR16","first-page":"2493","volume":"12","author":"R Collobert","year":"2011","unstructured":"Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa P. Natural Language Processing (Almost) from Scratch. J Mach Learn Res. 
2011;12:2493\u2013537.","journal-title":"J Mach Learn Res"},{"issue":"7","key":"3972_CR17","doi-asserted-by":"publisher","first-page":"990","DOI":"10.1101\/gr.200535.115","volume":"26","author":"DR Kelley","year":"2016","unstructured":"Kelley DR, Snoek J, Rinn JL. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. 2016;26(7):990\u20139.","journal-title":"Genome Res"},{"issue":"7","key":"3972_CR18","doi-asserted-by":"publisher","first-page":"1125","DOI":"10.1093\/bioinformatics\/bty752","volume":"35","author":"M Kalkatawi","year":"2019","unstructured":"Kalkatawi M, Magana-Mora A, Jankovic B, Bajic VB. DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and regions. Bioinformatics. 2019;35(7):1125\u201332.","journal-title":"Bioinformatics"},{"issue":"24","key":"3972_CR19","doi-asserted-by":"publisher","first-page":"5067","DOI":"10.1093\/bioinformatics\/btz451","volume":"35","author":"J Zhou","year":"2019","unstructured":"Zhou J, Lu Q, Gui L, Xu R, Long Y, Wang H. MTTFsite: cross-cell type TF binding site prediction by using multi-task learning. Bioinformatics. 2019;35(24):5067\u201377.","journal-title":"Bioinformatics"},{"issue":"16","key":"3972_CR20","doi-asserted-by":"publisher","first-page":"2730","DOI":"10.1093\/bioinformatics\/bty1068","volume":"35","author":"R Umarov","year":"2019","unstructured":"Umarov R, Kuwahara H, Li Y, Gao X, Solovyev V. Promoter analysis and prediction in the human genome using sequence-based deep learning models. Bioinformatics. 2019;35(16):2730\u20137.","journal-title":"Bioinformatics"},{"issue":"10","key":"3972_CR21","doi-asserted-by":"publisher","first-page":"931","DOI":"10.1038\/nmeth.3547","volume":"12","author":"J Zhou","year":"2015","unstructured":"Zhou J, Troyanskaya OG. Predicting effects of noncoding variants with deep learning-based sequence model. Nat Methods. 
2015;12(10):931\u20134.","journal-title":"Nat Methods"},{"issue":"22","key":"3972_CR22","doi-asserted-by":"publisher","first-page":"3873","DOI":"10.1093\/bioinformatics\/bty440","volume":"34","author":"V Gligorijevic","year":"2018","unstructured":"Gligorijevic V, Barot M, Bonneau R. deepNF: deep network fusion for protein function prediction. Bioinformatics. 2018;34(22):3873\u201381.","journal-title":"Bioinformatics"},{"issue":"Suppl 1","key":"3972_CR23","doi-asserted-by":"publisher","first-page":"9","DOI":"10.1186\/s12859-015-0852-1","volume":"17","author":"L Chen","year":"2016","unstructured":"Chen L, Cai C, Chen V, Lu X. Learning a hierarchical representation of the yeast transcriptomic machinery using an autoencoder model. BMC Bioinformatics. 2016;17(Suppl 1):9.","journal-title":"BMC Bioinformatics"},{"issue":"2","key":"3972_CR24","first-page":"15","volume":"7","author":"M Khalili","year":"2016","unstructured":"Khalili M, Alavi Majd H, Khodakarim S, Ahadi B, Hamidpour M. Prediction of the thromboembolic syndrome: an application of artificial neural networks in gene expression data analysis. Arch Adv Biosci (Journal of Paramedical Sciences). 2016;7(2):15\u201322.","journal-title":"Arch Adv Biosci (Journal of Paramedical Sciences)"},{"key":"3972_CR25","doi-asserted-by":"publisher","first-page":"26094","DOI":"10.1038\/srep26094","volume":"6","author":"R Miotto","year":"2016","unstructured":"Miotto R, Li L, Kidd BA, Dudley JT. Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci Rep. 
2016;6:26094.","journal-title":"Sci Rep"},{"key":"3972_CR26","doi-asserted-by":"crossref","unstructured":"Chen Q, Song X, Yamada H, Shibasaki R: Learning deep representation from big and heterogeneous data for traffic accident inference; 2016.","DOI":"10.1609\/aaai.v30i1.10011"},{"issue":"6","key":"3972_CR27","doi-asserted-by":"publisher","first-page":"1248","DOI":"10.1158\/1078-0432.CCR-17-0853","volume":"24","author":"K Chaudhary","year":"2018","unstructured":"Chaudhary K, Poirion OB, Lu L, Garmire LX. Deep learning-based multi-omics integration robustly predicts survival in liver cancer. Clin Cancer Res. 2018;24(6):1248\u201359.","journal-title":"Clin Cancer Res"},{"key":"3972_CR28","doi-asserted-by":"crossref","unstructured":"Zeiler M, Fergus R: Visualizing and understanding convolutional neural networks, vol. 8689; 2013.","DOI":"10.1007\/978-3-319-10590-1_53"},{"key":"3972_CR29","unstructured":"Springenberg J, Dosovitskiy A, Brox T, Riedmiller M: Striving for simplicity: The all convolutional net. 2014."},{"key":"3972_CR30","unstructured":"Simonyan K, Vedaldi A, Zisserman A: Deep inside convolutional networks: visualising image classification models and saliency maps. preprint 2013."},{"key":"3972_CR31","unstructured":"Shrikumar A, Greenside P, Kundaje A: Learning important features through propagating activation differences. 2017."},{"issue":"24","key":"3972_CR32","doi-asserted-by":"publisher","first-page":"4180","DOI":"10.1093\/bioinformatics\/bty497","volume":"34","author":"J Zuallaert","year":"2018","unstructured":"Zuallaert J, Godin F, Kim M, Soete A, Saeys Y, De Neve W. SpliceRover: interpretable convolutional neural networks for improved splice site prediction. Bioinformatics. 2018;34(24):4180\u20138.","journal-title":"Bioinformatics"},{"key":"3972_CR33","unstructured":"Gene expression inference with deep learning. 
Bioinformatics 2016."},{"key":"3972_CR34","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1016\/B978-1-55860-335-6.50023-4","volume-title":"Machine learning proceedings 1994","author":"GH John","year":"1994","unstructured":"John GH, Kohavi R, Pfleger K. Irrelevant Features and the Subset Selection Problem. In: Cohen WW, Hirsh H, editors. Machine learning proceedings 1994. San Francisco (CA): Morgan Kaufmann; 1994. p. 121\u20139."},{"key":"3972_CR35","unstructured":"Liaw A, Wiener M: Classification and regression by RandomForest. Forest 2001, 23."},{"issue":"2","key":"3972_CR36","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1093\/bioinformatics\/19.2.185","volume":"19","author":"BM Bolstad","year":"2003","unstructured":"Bolstad BM, Irizarry RA, Astrand M, Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003;19(2):185\u201393.","journal-title":"Bioinformatics"},{"issue":"16","key":"3972_CR37","doi-asserted-by":"publisher","first-page":"2796","DOI":"10.1093\/bioinformatics\/btz015","volume":"35","author":"W Chen","year":"2019","unstructured":"Chen W, Lv H, Nie F, Lin H. i6mA-Pred: identifying DNA N6-methyladenine sites in the rice genome. Bioinformatics. 2019;35(16):2796\u2013800.","journal-title":"Bioinformatics"},{"key":"3972_CR38","unstructured":"Kingma D, Ba J: Adam: a method for stochastic optimization. 
International Conference on Learning Representations 2014."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-03972-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-021-03972-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-03972-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,22]],"date-time":"2024-08-22T16:53:23Z","timestamp":1724345603000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-021-03972-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,22]]},"references-count":38,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["3972"],"URL":"https:\/\/doi.org\/10.1186\/s12859-021-03972-5","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-50807\/v1","asserted-by":"object"},{"id-type":"doi","id":"10.21203\/rs.3.rs-50807\/v2","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1,22]]},"assertion":[{"value":"29 July 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 January 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 January 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not 
applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"27"}}