{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T17:43:09Z","timestamp":1750268589439,"version":"3.37.3"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,9,12]],"date-time":"2022-09-12T00:00:00Z","timestamp":1662940800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,12]],"date-time":"2022-09-12T00:00:00Z","timestamp":1662940800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","award":["JP18H05214","JP18H05214"],"award-info":[{"award-number":["JP18H05214","JP18H05214"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Epigenetic modifications established in mammalian gametes are largely reprogrammed during early development, however, are partly inherited by the embryo to support its development. In this study, we examine CpG island (CGI) sequences to predict whether a mouse blastocyst CGI inherits oocyte-derived DNA methylation from the maternal genome. Recurrent neural networks (RNNs), including that based on gated recurrent units (GRUs), have recently been employed for variable-length inputs in classification and regression analyses. One advantage of this strategy is the ability of RNNs to automatically learn latent features embedded in inputs by learning their model parameters. 
However, the available CGI datasets for predicting the inheritance of oocyte-derived DNA methylation are not large enough to train such neural networks.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We propose a GRU-based model called CMIC (CGI Methylation Inheritance Classifier), which augments each CGI sequence by converting it into a sequence of variable-length <jats:italic>k<\/jats:italic>-mers, where the length <jats:italic>k<\/jats:italic> is randomly selected from the range <jats:inline-formula><jats:alternatives><jats:tex-math>$$k_{\\min }$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                    <mml:msub>\n                      <mml:mi>k<\/mml:mi>\n                      <mml:mo>min<\/mml:mo>\n                    <\/mml:msub>\n                  <\/mml:math><\/jats:alternatives><\/jats:inline-formula> to <jats:inline-formula><jats:alternatives><jats:tex-math>$$k_{\\max }$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                    <mml:msub>\n                      <mml:mi>k<\/mml:mi>\n                      <mml:mo>max<\/mml:mo>\n                    <\/mml:msub>\n                  <\/mml:math><\/jats:alternatives><\/jats:inline-formula>. This conversion is repeated <jats:italic>N<\/jats:italic> times, and the resulting sequences are used as neural network input. <jats:italic>N<\/jats:italic> was set to 1000 in the default setting. In addition, we propose a new embedding vector generator for <jats:italic>k<\/jats:italic>-mers called splitDNA2vec. The randomness of this procedure is higher than that of the previous work, dna2vec.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>We found that CMIC can predict the inheritance of oocyte-derived DNA methylation at CGIs in the maternal genome of blastocysts with a high F-measure (0.93). 
We also show that the F-measure can be improved by increasing the parameter <jats:italic>N<\/jats:italic>, that is, the number of sequences of variable-length <jats:italic>k<\/jats:italic>-mers derived from a single CGI sequence. This implies the effectiveness of augmenting input data by converting a DNA sequence to <jats:italic>N<\/jats:italic> sequences of variable-length <jats:italic>k<\/jats:italic>-mers. This approach can be applied to different DNA sequence classification and regression analyses, particularly those involving a small amount of data.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-022-04916-3","type":"journal-article","created":{"date-parts":[[2022,9,12]],"date-time":"2022-09-12T17:03:17Z","timestamp":1663002197000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["CMIC: predicting DNA methylation inheritance of CpG islands with embedding vectors of variable-length k-mers"],"prefix":"10.1186","volume":"23","author":[{"given":"Osamu","family":"Maruyama","sequence":"first","affiliation":[]},{"given":"Yinuo","family":"Li","sequence":"additional","affiliation":[]},{"given":"Hiroki","family":"Narita","sequence":"additional","affiliation":[]},{"given":"Hidehiro","family":"Toh","sequence":"additional","affiliation":[]},{"given":"Wan Kin","family":"Au Yeung","sequence":"additional","affiliation":[]},{"given":"Hiroyuki","family":"Sasaki","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,9,12]]},"reference":[{"key":"4916_CR1","doi-asserted-by":"publisher","first-page":"281","DOI":"10.1016\/j.ceb.2013.02.013","volume":"25","author":"S Seisenberger","year":"2013","unstructured":"Seisenberger S, Peat JR, Reik W. Conceptual links between DNA methylation reprogramming in the early embryo and primordial germ cells. Curr Opin Cell Biol. 
2013;25:281\u20138.","journal-title":"Curr Opin Cell Biol"},{"key":"4916_CR2","doi-asserted-by":"publisher","first-page":"952","DOI":"10.1016\/j.cell.2019.01.043","volume":"176","author":"V Tucci","year":"2019","unstructured":"Tucci V, Isles AR, Kelsey G, Ferguson-Smith AC, Tucci V, Bartolomei MS, Benvenisty N, Bourc\u2019his D, Charalambous M, Dulac C, Feil R, Glaser J, Huelsmann L, John RM, McNamara GI, Moorwood K, Muscatelli F, Sasaki H, Strassmann BI, Vincenz C, Wilkins J, Isles AR, Kelsey G, Ferguson-Smith AC. Genomic imprinting and physiological processes in mammals. Cell. 2019;176:952\u201365.","journal-title":"Cell"},{"key":"4916_CR3","doi-asserted-by":"crossref","unstructured":"Lacal I, Ventura R. Epigenetic inheritance: Concepts, mechanisms and perspectives. Front Mol Neurosci. 2018;11","DOI":"10.3389\/fnmol.2018.00292"},{"key":"4916_CR4","doi-asserted-by":"publisher","first-page":"1010","DOI":"10.1101\/gad.2037511","volume":"25","author":"AM Deaton","year":"2011","unstructured":"Deaton AM, Bird A. CpG islands and the regulation of transcription. Gene Dev. 2011;25:1010\u201322.","journal-title":"Gene Dev"},{"key":"4916_CR5","doi-asserted-by":"publisher","first-page":"1607","DOI":"10.1101\/gad.1667008","volume":"22","author":"R Hirasawa","year":"2008","unstructured":"Hirasawa R, Chiba H, Kaneda M, Tajima S, Li E, Jaenisch R, Sasaki H. Maternal and zygotic Dnmt1 are necessary and sufficient for the maintenance of DNA methylation imprints during preimplantation development. Gene Dev. 2008;22:1607\u201316.","journal-title":"Gene Dev"},{"key":"4916_CR6","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.1007042","volume":"13","author":"S Maenohara","year":"2017","unstructured":"Maenohara S, Unoki M, Toh H, Ohishi H, Sharif J, Koseki H, Sasaki H. Role of UHRF1 in de novo DNA methylation in oocytes and maintenance methylation in preimplantation embryos. PLoS Genet. 
2017;13: e1007042.","journal-title":"PLoS Genet"},{"key":"4916_CR7","doi-asserted-by":"publisher","first-page":"282","DOI":"10.1016\/j.celrep.2019.03.002","volume":"27","author":"WK Au Yeung","year":"2019","unstructured":"Au Yeung WK, Brind Amour J, Hatano Y, Yamagata K, Feil R, Lorincz MC, Tachibana M, Shinkai Y, Sasaki H. Histone H3K9 methyltransferase G9a in oocytes is essential for preimplantation development but dispensable for CG methylation protection. Cell Rep. 2019;27:282\u201393.","journal-title":"Cell Rep"},{"key":"4916_CR8","doi-asserted-by":"publisher","first-page":"811","DOI":"10.1038\/ng.864","volume":"43","author":"SA Smallwood","year":"2011","unstructured":"Smallwood SA, Tomizawa S-I, Krueger F, Ruf N, Carli N, Segonds-Pichon A, Sato S, Hata K, Andrews SR, Kelsey G. Dynamic CpG island methylation landscape in oocytes and preimplantation embryos. Nat Genet. 2011;43:811\u20134.","journal-title":"Nat Genet"},{"key":"4916_CR9","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1186\/s13059-015-0672-7","volume":"16","author":"R Strogantsev","year":"2015","unstructured":"Strogantsev R, Krueger F, Yamazawa K, Shi H, Gould P, Goldman-Roberts M, McEwen K, Sun B, Pedersen R, Ferguson-Smith AC. Allele-specific binding of ZFP57 in the epigenetic regulation of imprinted and non-imprinted monoallelic expression. Genome Biol. 2015;16:112.","journal-title":"Genome Biol"},{"key":"4916_CR10","doi-asserted-by":"publisher","first-page":"12253","DOI":"10.1073\/pnas.2037852100","volume":"100","author":"FA Feltus","year":"2003","unstructured":"Feltus FA, Lee EK, Costello JF, Plass C, Vertino PM. Predicting aberrant CpG island methylation. Proc Natl Acad Sci USA. 2003;100:12253\u20138.","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"4916_CR11","doi-asserted-by":"publisher","first-page":"572","DOI":"10.1016\/j.ygeno.2005.12.016","volume":"87","author":"FA Feltus","year":"2006","unstructured":"Feltus FA, Lee EK, Costello JF, Plass C, Vertino PM. 
DNA motifs associated with aberrant CpG island methylation. Genomics. 2006;87:572\u20139.","journal-title":"Genomics"},{"key":"4916_CR12","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.0020026","volume":"2","author":"C Bock","year":"2006","unstructured":"Bock C, Paulsen M, Tierling S, Mikeska T, Lengauer T, Walter J. CpG Island methylation in human lymphocytes is highly correlated with DNA sequence, repeats, and predicted DNA structure. PLoS Genet. 2006;2: e26.","journal-title":"PLoS Genet"},{"key":"4916_CR13","doi-asserted-by":"publisher","first-page":"2204","DOI":"10.1093\/bioinformatics\/btl377","volume":"22","author":"F Fang","year":"2006","unstructured":"Fang F, Fan S, Zhang X, Zhang MQ. Predicting methylation status of CpG islands in the human brain. Bioinformatics. 2006;22:2204\u20139.","journal-title":"Bioinformatics"},{"key":"4916_CR14","doi-asserted-by":"publisher","first-page":"S15","DOI":"10.1186\/1471-2105-13-S3-S15","volume":"13","author":"Y Yang","year":"2012","unstructured":"Yang Y, Nephew K, Kim S. A novel K-Mer mixture logistic regression for methylation susceptibility modeling of CpG dinucleotides in human gene promoters. BMC Bioinform. 2012;13:S15.","journal-title":"BMC Bioinform"},{"issue":"Suppl 1","key":"4916_CR15","doi-asserted-by":"publisher","first-page":"S13","DOI":"10.1186\/1755-8794-6-S1-S13","volume":"6","author":"H Zheng","year":"2013","unstructured":"Zheng H, Wu H, Li J, Jiang S-W. CpGIMethPred: computational model for predicting methylation status of CpG islands in human genome. BMC Med Genomics. 2013;6(Suppl 1):S13.","journal-title":"BMC Med Genomics"},{"key":"4916_CR16","doi-asserted-by":"publisher","first-page":"179","DOI":"10.2174\/1574893615999200724145835","volume":"16","author":"D Yalcin","year":"2021","unstructured":"Yalcin D, Otu HH. An unbiased predictive model to detect DNA methylation propensity of CpG Islands in the human genome. Curr Bioinform. 
2021;16:179\u201396.","journal-title":"Curr Bioinform"},{"key":"4916_CR17","first-page":"1","volume":"8","author":"Z Shen","year":"2018","unstructured":"Shen Z, Bao W, Huang D-S. Recurrent neural network for predicting transcription factor binding sites. Sci Rep. 2018;8:1\u201310.","journal-title":"Sci Rep"},{"key":"4916_CR18","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkw226","volume":"44","author":"D Quang","year":"2016","unstructured":"Quang D, Xie X. DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. Nucleic Acids Res. 2016;44: e107.","journal-title":"Nucleic Acids Res"},{"key":"4916_CR19","doi-asserted-by":"crossref","first-page":"i92","DOI":"10.1093\/bioinformatics\/btx234","volume":"33","author":"X Min","year":"2017","unstructured":"Min X, Zeng W, Chen N, Chen T, Jiang R. Chromatin accessibility prediction via convolutional long short-term memory networks with $$k$$-mer embedding. Bioinformatics. 2017;33:i92\u2013101.","journal-title":"Bioinformatics"},{"key":"4916_CR20","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1002\/aris.1440370103","volume":"37","author":"GG Chowdhury","year":"2003","unstructured":"Chowdhury GG. Natural language processing. Annu Rev Inform Sci. 2003;37:51\u201389.","journal-title":"Annu Rev Inform Sci"},{"issue":"245","key":"4916_CR21","first-page":"1","volume":"21","author":"S Chen","year":"2020","unstructured":"Chen S, Dobriban E, Lee JH. A group-theoretic framework for data augmentation. J Mach Learn Res. 2020;21(245):1\u201371.","journal-title":"J Mach Learn Res"},{"key":"4916_CR22","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 2014;15:1929\u201358.","journal-title":"J Mach Learn Res"},{"key":"4916_CR23","unstructured":"Kingma DP, Ba J. Adam. 
A method for stochastic optimization. Preprint arXiv:1412.6980. 2014."},{"key":"4916_CR24","unstructured":"Ng P. dna2vec: Consistent vector representations of variable-length k-mers. 2017."},{"key":"4916_CR25","doi-asserted-by":"crossref","unstructured":"Cho K, Van\u00a0Merri\u00ebnboer B, Bahdanau D, Bengio Y. On the properties of neural machine translation: Encoder-decoder approaches. Preprint arXiv:1409.1259. 2014.","DOI":"10.3115\/v1\/W14-4012"},{"key":"4916_CR26","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9:1735\u201380.","journal-title":"Neural Comput"},{"key":"4916_CR27","doi-asserted-by":"publisher","first-page":"2673","DOI":"10.1109\/78.650093","volume":"45","author":"M Schuster","year":"1997","unstructured":"Schuster M, Paliwal KK. Bidirectional recurrent neural networks. IEEE T Signal Proces. 1997;45:2673\u201381.","journal-title":"IEEE T Signal Proces"},{"key":"4916_CR28","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.1003439","volume":"9","author":"K Shirane","year":"2013","unstructured":"Shirane K, Toh H, Kobayashi H, Miura F, Chiba H, Ito T, Kono T, Sasaki H. Mouse oocyte methylomes at base resolution reveal genome-wide accumulation of non-CpG methylation and role of DNA methyltransferases. PLoS Genet. 2013;9: e1003439.","journal-title":"PLoS Genet"},{"key":"4916_CR29","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.1009570","volume":"17","author":"K Kibe","year":"2021","unstructured":"Kibe K, Shirane K, Ohishi H, Uemura S, Toh H, Sasaki H. The DNMT3A PWWP domain is essential for the normal DNA methylation landscape in mouse somatic cells and oocytes. PLoS Genet. 
2021;17: e1009570.","journal-title":"PLoS Genet"},{"key":"4916_CR30","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1093\/nar\/gkg129","volume":"31","author":"D Karolchik","year":"2003","unstructured":"Karolchik D, Baertsch R, Diekhans M, Furey TS, Hinrichs A, Lu Y, Roskin KM, Schwartz M, Sugnet CW, Thomas DJ, et al. The UCSC genome browser database. Nucleic Acids Res. 2003;31:51\u20134.","journal-title":"Nucleic Acids Res"},{"key":"4916_CR31","doi-asserted-by":"publisher","first-page":"1571","DOI":"10.1093\/bioinformatics\/btr167","volume":"27","author":"F Krueger","year":"2011","unstructured":"Krueger F, Andrews SR. Bismark: a flexible aligner and methylation caller for bisulfite-seq applications. Bioinformatics. 2011;27:1571\u20132.","journal-title":"Bioinformatics"},{"key":"4916_CR32","doi-asserted-by":"publisher","first-page":"1329","DOI":"10.1101\/gr.156497.113","volume":"23","author":"T Takada","year":"2013","unstructured":"Takada T, Ebata T, Noguchi H, Keane TM, Adams DJ, Narita T, Shin T, Fujisawa H, Toyoda A, Abe K, et al. The ancestor of extant Japanese fancy mice contributed to the mosaic genomes of classical inbred strains. Genome Res. 2013;23:1329\u201338.","journal-title":"Genome Res"},{"key":"4916_CR33","doi-asserted-by":"publisher","first-page":"1479","DOI":"10.12688\/f1000research.9037.1","volume":"5","author":"F Krueger","year":"2016","unstructured":"Krueger F, Andrews SR. SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes. F1000Research. 2016;5:1479.","journal-title":"F1000Research"},{"key":"4916_CR34","doi-asserted-by":"publisher","first-page":"15270","DOI":"10.1038\/s41598-018-33321-1","volume":"8","author":"Z Shen","year":"2018","unstructured":"Shen Z, Bao W, Huang D-S. Recurrent neural network for predicting transcription factor binding sites. Sci Rep. 
2018;8:15270.","journal-title":"Sci Rep"},{"key":"4916_CR35","doi-asserted-by":"crossref","unstructured":"Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. Technical report, California Univ San Diego La Jolla Inst for Cognitive Science. 1985.","DOI":"10.21236\/ADA164453"},{"key":"4916_CR36","first-page":"2579","volume":"9","author":"L van der Maaten","year":"2008","unstructured":"van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res. 2008;9:2579\u2013605.","journal-title":"J Mach Learn Res"},{"key":"4916_CR37","doi-asserted-by":"publisher","unstructured":"Yamada Y, Watanabe H, Miura F, Soejima H, Uchiyama M, Iwasaka T, Mukai T, Sakaki Y, Ito T. A comprehensive analysis of allelic methylation status of CpG islands on human chromosome 21q. Genome Res. 2004;14(2):247-66. https:\/\/doi.org\/10.1101\/gr.1351604.","DOI":"10.1101\/gr.1351604"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-04916-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-022-04916-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-04916-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,13]],"date-time":"2022-09-13T05:09:29Z","timestamp":1663045769000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-022-04916-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,12]]},"references-count":37,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["4916"],"URL":"https:\/\/doi.org\/10
.1186\/s12859-022-04916-3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2022,9,12]]},"assertion":[{"value":"10 September 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 September 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 September 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"All animal experiments were performed under the ethical guidelines of Kyushu University, and the protocols were approved by the Institutional Animal Care and Use Committee. All mice used in this study were euthanized by carbon dioxide asphyxiation. This study is reported in accordance with ARRIVE guidelines.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}],"article-number":"371"}}