{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:10:22Z","timestamp":1760238622991,"version":"build-2065373602"},"reference-count":46,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2020,8,27]],"date-time":"2020-08-27T00:00:00Z","timestamp":1598486400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004281","name":"Narodowe Centrum Nauki","doi-asserted-by":"publisher","award":["2016\/21\/P\/NZ2\/03926"],"award-info":[{"award-number":["2016\/21\/P\/NZ2\/03926"]}],"id":[{"id":"10.13039\/501100004281","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>The gene is a fundamental concept of genetics, which emerged with the Mendelian paradigm of heredity at the beginning of the 20th century. However, the concept has since diversified. Somewhat different narratives and models of the gene developed in several sub-disciplines of genetics, that is in classical genetics, population genetics, molecular genetics, genomics, and, recently, also, in systems genetics. Here, I ask how the diversity of the concept impacts data-integration and data-mining strategies for bioinformatics, genomics, statistical genetics, and data science. I also consider theoretical background of the concept of the gene in the ideas of empiricism and experimentalism, as well as reductionist and anti-reductionist narratives on the concept. Finally, a few strategies of analysis from published examples of data-mining projects are discussed. Moreover, the examples are re-interpreted in the light of the theoretical material. I argue that the choice of an optimal level of abstraction for the gene is vital for a successful genome analysis.<\/jats:p>","DOI":"10.3390\/e22090942","type":"journal-article","created":{"date-parts":[[2020,8,28]],"date-time":"2020-08-28T09:17:08Z","timestamp":1598606228000},"page":"942","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Models of the Gene Must Inform Data-Mining Strategies in Genomics"],"prefix":"10.3390","volume":"22","author":[{"given":"\u0141ukasz","family":"Huminiecki","sequence":"first","affiliation":[{"name":"Department of Molecular Biology, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, 00-901 Warsaw, Poland"}]}],"member":"1968","published-online":{"date-parts":[[2020,8,27]]},"reference":[{"key":"ref_1","first-page":"3","article-title":"Versuche \u00fcber Pflanzenhybriden","volume":"IV","author":"Mendel","year":"1866","journal-title":"Verhandlungen Naturforschenden Vereines Br\u00fcnn"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Hull, D.L., and Ruse, M. (2007). Gene. The Cambridge Companion to the Philosophy of Biology, Cambridge University Press.","DOI":"10.1017\/CCOL9780521851282"},{"key":"ref_3","unstructured":"Ptashne, M.G.A. (2002). Genes and Signals, Cold Spring Harbor Laboratory Press."},{"key":"ref_4","first-page":"67","article-title":"Max Ludwig Henning Delbruck\u2014September 4, 1906-March 10, 1981","volume":"62","author":"Hayes","year":"1992","journal-title":"Biogr. Mem. Natl. Acad. Sci."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1007\/s11017-006-9020-y","article-title":"Genes in the postgenomic era","volume":"27","author":"Griffiths","year":"2006","journal-title":"Theor. Med. Bioeth."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1007\/BF00165258","article-title":"Are Genes Units of Inheritance","volume":"5","author":"Fogle","year":"1990","journal-title":"Biol. Philos."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Engstrom, P.G., Suzuki, H., Ninomiya, N., Akalin, A., Sessa, L., Lavorgna, G., Brozzi, A., Luzi, L., Tan, S.L., and Yang, L. (2006). Complex Loci in human and mouse genomes. PLoS Genet., 2.","DOI":"10.1371\/journal.pgen.0020047"},{"key":"ref_8","unstructured":"and Barnes, J. (1995). The Complete Works of Aristotle: The Revised Oxford Translation, Princeton University Press."},{"key":"ref_9","unstructured":"Bacon, F. (1620). George Fabyan Collection (Library of Congress). Francisci de Verulamio, Summi Angliae Cancellarii, Instauratio Magna, Apud Joannem Billium, Typographum Regium."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"D5","DOI":"10.1093\/nar\/gkl1031","article-title":"Database resources of the National Center for Biotechnology Information","volume":"35","author":"Wheeler","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1534\/genetics.109.101659","article-title":"Thomas Hunt Morgan at the marine biological laboratory: Naturalist and experimentalist","volume":"181","author":"Kenney","year":"2009","journal-title":"Genetics"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1002\/jez.1400140104","article-title":"The linear arrangement of six sex-linked factors in Drosophila, as shown by their mode of association","volume":"14","author":"Sturtevant","year":"1913","journal-title":"J. Exp. Zool."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Fisher, R.A., and Bennett, J.H. (1990). Statistical Methods, Experimental Design, and Scientific Inference, Oxford University Press.","DOI":"10.1093\/oso\/9780198522294.001.0001"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Uebel, T.E., and Richardson, A.W. (2007). The Cambridge Companion to Logical Empiricism, Cambridge University Press.","DOI":"10.1017\/CCOL0521791782"},{"key":"ref_15","unstructured":"Kuhn, T.S. (1962). The Structure of Scientific Revolutions, University of Chicago Press."},{"key":"ref_16","first-page":"14","article-title":"The Development of Genetics in the Light of Thomas Kuhn\u2019s Theory of Scientific Revolutions","volume":"9","author":"Portin","year":"2015","journal-title":"Recent Adv. DNA Gene Seq."},{"key":"ref_17","unstructured":"Nagel, E. (1961). The Structure of Science: Problems in the Logic of Scientific Explanation, Routledge & Kegan Paul."},{"key":"ref_18","unstructured":"Schaffner, K.F. (1993). Discovery and Explanation in Biology and Medicine, University of Chicago Press."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Lehmann, E.L. (2011). Fisher, Neyman, and the Creation of Classical Statistics, Springer.","DOI":"10.1007\/978-1-4419-9500-1"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Efron, B., and Hastie, T. (2016). Computer Age Statistical Inference: Algorithms, Evidence, and Data Science, Cambridge University Press.","DOI":"10.1017\/CBO9781316576533"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Edgington, E.S. (2007). Randomization Tests, CRC Press.","DOI":"10.1201\/9781420011814"},{"key":"ref_22","unstructured":"Kuhn, T.S. (1985). The Copernican Revolution\u2014Planetary Astronomy in the Development of Western Thought, Harvard University Press."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1093\/jhered\/esw058","article-title":"Are Mendel\u2019s Data Reliable? The Perspective of a Pea Geneticist","volume":"107","author":"Weeden","year":"2016","journal-title":"J. Hered."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"787","DOI":"10.1038\/248787a0","article-title":"Rosalind Franklin and the double helix","volume":"248","author":"Klug","year":"1974","journal-title":"Nature"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1186\/s13059-014-0413-3","article-title":"A simple metric of promoter architecture robustly predicts expression breadth of human genes suggesting that most transcription factors are positive regulators","volume":"15","author":"Hurst","year":"2014","journal-title":"Genome Biol."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Huminiecki, L. (2018). Modelling of the breadth of expression from promoter architectures identifies pro-housekeeping transcription factors. PLoS ONE, 13.","DOI":"10.1371\/journal.pone.0198961"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1796","DOI":"10.1101\/gr.150700","article-title":"In silico cloning of novel endothelial-specific genes","volume":"10","author":"Huminiecki","year":"2000","journal-title":"Genome Res."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1006\/geno.2002.6745","article-title":"Magic roundabout is a new member of the roundabout receptor family that is endothelial specific and expressed at sites of active angiogenesis","volume":"79","author":"Huminiecki","year":"2002","journal-title":"Genomics"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Huminiecki, L. (2019). Magic roundabout is an endothelial-specific ohnolog of ROBO1 which neo-functionalized to an essential new role in angiogenesis. PLoS ONE, 14.","DOI":"10.1371\/journal.pone.0208952"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Huminiecki, L., Goldovsky, L., Freilich, S., Moustakas, A., Ouzounis, C., and Heldin, C.H. (2009). Emergence, development and diversification of the TGF-beta signalling pathway within the animal kingdom. BMC Evol. Biol., 9.","DOI":"10.1186\/1471-2148-9-28"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Huminiecki, L., and Heldin, C.H. (2010). 2R and remodeling of vertebrate signal transduction engine. BMC Biol., 8.","DOI":"10.1186\/1741-7007-8-146"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1111\/jth.12253","article-title":"Evolutionary origins of the blood vascular system and endothelium","volume":"11","author":"Dvorak","year":"2013","journal-title":"J. Thromb. Haemost."},{"key":"ref_33","unstructured":"(2020, August 25). FANTOM5 Presentation of CAGE Technology. Available online: http:\/\/fantom.gsc.riken.jp\/protocols\/."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"462","DOI":"10.1038\/nature13182","article-title":"A promoter-level mammalian expression atlas","volume":"507","author":"Forrest","year":"2014","journal-title":"Nature"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1016\/j.tibtech.2017.03.007","article-title":"Can We Predict Gene Expression by Understanding Proximal Promoter Architecture?","volume":"35","author":"Huminiecki","year":"2017","journal-title":"Trends Biotechnol."},{"key":"ref_36","unstructured":"Ohno, S. (2013). Evolution by Gene Duplication, Springer."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"938","DOI":"10.1038\/nrg2482","article-title":"Turning a hobby into a job: How duplicated genes find new functions","volume":"9","author":"Conant","year":"2008","journal-title":"Nat. Rev. Genet."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1038\/nrg1272","article-title":"Network biology: Understanding the cell\u2019s functional organization","volume":"5","author":"Barabasi","year":"2004","journal-title":"Nat. Rev. Genet."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nrg2918","article-title":"Network medicine: A network-based approach to human disease","volume":"12","author":"Barabasi","year":"2011","journal-title":"Nat. Rev. Genet."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1056\/NEJMe078114","article-title":"Network medicine\u2014From obesity to the \u201cdiseasome\u201d","volume":"357","author":"Barabasi","year":"2007","journal-title":"N. Engl. J. Med."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"W749","DOI":"10.1093\/nar\/gkq428","article-title":"GSA-SNP: A general approach for gene set analysis of polymorphisms","volume":"38","author":"Nam","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"e60","DOI":"10.1093\/nar\/gky175","article-title":"Efficient pathway enrichment and network analysis of GWAS summary data using GSA-SNP2","volume":"46","author":"Yoon","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1222","DOI":"10.1101\/gr.128819.111","article-title":"The human phosphotyrosine signaling network: Evolution and hotspots of hijacking in cancer","volume":"22","author":"Li","year":"2012","journal-title":"Genome Res."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1038\/msb4100200","article-title":"A map of human cancer signaling","volume":"3","author":"Cui","year":"2007","journal-title":"Mol. Syst. Biol."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Picart-Armada, S., Barrett, S.J., Wille, D.R., Perera-Lluna, A., Gutteridge, A., and Dessailly, B.H. (2019). Benchmarking network propagation methods for disease gene identification. PLoS Comput. Biol., 15.","DOI":"10.1371\/journal.pcbi.1007276"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hill, A., Gleim, S., Kiefer, F., Sigoillot, F., Loureiro, J., Jenkins, J., and Morris, M.K. (2019). Benchmarking network algorithms for contextualizing genes of interest. PLoS Comput. Biol., 15.","DOI":"10.1371\/journal.pcbi.1007403"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/9\/942\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:03:56Z","timestamp":1760177036000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/9\/942"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,27]]},"references-count":46,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["e22090942"],"URL":"https:\/\/doi.org\/10.3390\/e22090942","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2020,8,27]]}}}