{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T11:07:42Z","timestamp":1768561662156,"version":"3.49.0"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T00:00:00Z","timestamp":1619136000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T00:00:00Z","timestamp":1619136000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100006792","name":"Hartwell Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006792","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Bio-X Center"},{"name":"Precision Health and Integrated Diagnostics Center"},{"DOI":"10.13039\/100000092","name":"U.S. National Library of Medicine","doi-asserted-by":"publisher","award":["5 T32 LM012409-03"],"award-info":[{"award-number":["5 T32 LM012409-03"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BioData Mining"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. 
Family data provide a unique opportunity for estimating sequencing error rates since it allows us to observe a fraction of sequencing errors as Mendelian errors in the family, which we can then use to produce genome-wide error estimates for each sample.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We introduce a method that uses Mendelian errors in sequencing data to make highly granular per-sample estimates of precision and recall for any set of variant calls, regardless of sequencing platform or calling methodology. We validate the accuracy of our estimates using monozygotic twins, and we use a set of monozygotic quadruplets to show that our predictions closely match the consensus method. We demonstrate our method\u2019s versatility by estimating sequencing error rates for whole genome sequencing, whole exome sequencing, and microarray datasets, and we highlight its sensitivity by quantifying performance increases between different versions of the GATK variant-calling pipeline. We then use our method to demonstrate that: 1) Sequencing error rates between samples in the same dataset can vary by over an order of magnitude. 2) Variant calling performance decreases substantially in low-complexity regions of the genome. 3) Variant calling performance in whole exome sequencing data decreases with distance from the nearest target region. 4) Variant calls from lymphoblastoid cell lines can be as accurate as those from whole blood. 
5) Whole-genome sequencing can attain microarray-level precision and recall at disease-associated SNV sites.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>Genotype datasets from families are powerful resources that can be used to make fine-grained estimates of sequencing error for any sequencing platform and variant-calling methodology.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s13040-021-00259-6","type":"journal-article","created":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T10:04:14Z","timestamp":1619172254000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Estimating sequencing error rates using families"],"prefix":"10.1186","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5252-1401","authenticated-orcid":false,"given":"Kelley","family":"Paskov","sequence":"first","affiliation":[]},{"given":"Jae-Yoon","family":"Jung","sequence":"additional","affiliation":[]},{"given":"Brianna","family":"Chrisman","sequence":"additional","affiliation":[]},{"given":"Nate T.","family":"Stockham","sequence":"additional","affiliation":[]},{"given":"Peter","family":"Washington","sequence":"additional","affiliation":[]},{"given":"Maya","family":"Varma","sequence":"additional","affiliation":[]},{"given":"Min Woo","family":"Sun","sequence":"additional","affiliation":[]},{"given":"Dennis P.","family":"Wall","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,4,23]]},"reference":[{"issue":"335","key":"259_CR1","doi-asserted-by":"publisher","first-page":"335ps10","DOI":"10.1126\/scitranslmed.aaf7314","volume":"8","author":"RB Altman","year":"2016","unstructured":"Altman RB, Prabhu S, Sidow A, Zook JM, Goldfeder R, Litwack D, Ashley E, Asimenos G, Bustamante CD, Donigan K, Giacomini KM. A research roadmap for next-generation sequencing informatics. 
Sci Transl Med. 2016; 8(335):335ps10-.","journal-title":"Sci Transl Med"},{"issue":"1","key":"259_CR2","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1038\/nbt.2065","volume":"30","author":"HYK Lam","year":"2012","unstructured":"Lam HYK, Clark MJ, Chen R, Chen R, Natsoulis G, O\u2019Huallachain M, Dewey FE, Habegger L, Ashley EA, Gerstein MB, Butte AJ, Ji HP, Snyder M. Performance comparison of whole-genome sequencing platforms. Nat Biotechnol. 2012; 30(1):78\u201382. https:\/\/doi.org\/10.1038\/nbt.2065.","journal-title":"Nat Biotechnol"},{"issue":"1","key":"259_CR3","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1002\/0471250953.bi1110s43","volume":"43","author":"GA Van der Auwera","year":"2013","unstructured":"Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, Banks E, Garimella KV, Altshuler D, Gabriel S, DePristo MA. From fastQ data to high-confidence variant calls: The genome analysis toolkit best practices pipeline. Curr Protocol Bioinforma. 2013; 43(1):11\u201310. https:\/\/doi.org\/10.1002\/0471250953.bi1110s43.","journal-title":"Curr Protocol Bioinforma"},{"issue":"1","key":"259_CR4","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1038\/nrg3655","volume":"15","author":"K Robasky","year":"2014","unstructured":"Robasky K, Lewis NE, Church GM. The role of replicates for error mitigation in next-generation sequencing. Nat Rev Genet. 2014; 15(1):56\u201362.","journal-title":"Nat Rev Genet"},{"issue":"3","key":"259_CR5","doi-asserted-by":"publisher","first-page":"246","DOI":"10.1038\/nbt.2835","volume":"32","author":"JM Zook","year":"2014","unstructured":"Zook JM, Chapman B, Wang J, Mittelman D, Hofmann O, Hide W, Salit M. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls. Nat Biotechnol. 2014; 32(3):246\u201351. 
https:\/\/doi.org\/10.1038\/nbt.2835.","journal-title":"Nat Biotechnol"},{"issue":"1","key":"259_CR6","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1101\/gr.210500.116","volume":"27","author":"MA Eberle","year":"2017","unstructured":"Eberle MA, Fritzilas E, Krusche P, K\u00e4llberg M, Moore BL, Bekritsky MA, Iqbal Z, Chuang HY, Humphray SJ, Halpern AL, Kruglyak S, Margulies EH, McVean G, Bentley DR. A reference data set of 5.4 million phased human variants validated by genetic inheritance from sequencing a three-generation 17-member pedigree. Genome Res. 2017; 27(1):157\u201364. https:\/\/doi.org\/10.1101\/gr.210500.116.","journal-title":"Genome Res"},{"issue":"3","key":"259_CR7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/gm432","volume":"5","author":"J O\u2019Rawe","year":"2013","unstructured":"O\u2019Rawe J, Jiang T, Sun G, Wu Y, Wang W, Hu J, Bodily P, Tian L, Hakonarson H, Johnson WE, Wei Z, Wang K, Lyon GJ. Low concordance of multiple variant-calling pipelines: Practical implications for exome and genome sequencing. Genome Med. 2013; 5(3):1\u201318. https:\/\/doi.org\/10.1186\/gm432.","journal-title":"Genome Med"},{"issue":"1","key":"259_CR8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-14-184","volume":"14","author":"A Hatem","year":"2013","unstructured":"Hatem A, Bozda\u011f D, Toland AE, \u00c7ataly\u00fcrek \u00dcV. Benchmarking short sequence mapping tools. BMC Bioinformatics. 2013; 14(1):1\u201325.","journal-title":"BMC Bioinformatics"},{"issue":"24","key":"259_CR9","doi-asserted-by":"publisher","first-page":"3169","DOI":"10.1093\/bioinformatics\/bts605","volume":"28","author":"NA Fonseca","year":"2012","unstructured":"Fonseca NA, Rung J, Brazma A, Marioni JC. Tools for mapping high-throughput sequencing data. Bioinformatics. 2012; 28(24):3169\u201377. 
https:\/\/doi.org\/10.1093\/bioinformatics\/bts605.","journal-title":"Bioinformatics"},{"issue":"1","key":"259_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-14-274","volume":"14","author":"X Yu","year":"2013","unstructured":"Yu X, Sun S. Comparing a few SNP calling algorithms using low-coverage sequencing data. BMC Bioinformatics. 2013; 14(1):1\u201315. https:\/\/doi.org\/10.1186\/1471-2105-14-274.","journal-title":"BMC Bioinformatics"},{"key":"259_CR11","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1186\/1479-7364-8-14","volume":"8","author":"M Pirooznia","year":"2014","unstructured":"Pirooznia M, Kramer M, Parla J, Goes FS, Potash JB, McCombie WR, Zandi PP. Validation and assessment of variant calling pipelines for next-generation sequencing. Hum Genomics. 2014; 8:14. https:\/\/doi.org\/10.1186\/1479-7364-8-14.","journal-title":"Hum Genomics"},{"key":"259_CR12","doi-asserted-by":"publisher","unstructured":"Brandt DYC, Aguiar VRC, Bitarello BD, Nunes K, Goudet J, Meyer D. Mapping bias overestimates reference allele frequencies at the HLA genes in the 1000 genomes project phase I data. G3: Genes, Genomes, Genetics. 2015. https:\/\/doi.org\/10.1534\/g3.114.015784.","DOI":"10.1534\/g3.114.015784"},{"issue":"1","key":"259_CR13","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1038\/jhg.2012.123","volume":"58","author":"JH Oh","year":"2013","unstructured":"Oh JH, Kim YJ, Moon S, Nam HY, Jeon JP, Ho Lee J, Lee JY, Cho YS. Genotype instability during long-term subculture of lymphoblastoid cell lines. J Hum Genet. 2013; 58(1):16\u201320. https:\/\/doi.org\/10.1038\/jhg.2012.123.","journal-title":"J Hum Genet"},{"issue":"12","key":"259_CR14","doi-asserted-by":"publisher","first-page":"e0144162","DOI":"10.1371\/journal.pone.0144162","volume":"10","author":"E Oh","year":"2015","unstructured":"Oh E, Choi YL, Kwon MJ, Kim RN, Kim YJ, Song JY, Jung KS, Shin YK. 
Comparison of accuracy of whole-exome sequencing with formalin-fixed paraffin-embedded and fresh frozen tissue samples. PLoS ONE. 2015; 10(12):e0144162. https:\/\/doi.org\/10.1371\/journal.pone.0144162.","journal-title":"PLoS ONE"},{"issue":"1","key":"259_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-016-1029-6","volume":"17","author":"Y Fan","year":"2016","unstructured":"Fan Y, Xi L, Hughes DST, Zhang J, Zhang J, Futreal PA, Wheeler DA, Wang W. MuSE: accounting for tumor heterogeneity using a sample-specific error model improves sensitivity and specificity in mutation calling from sequencing data. Genome Biol. 2016; 17(1):1\u201311. https:\/\/doi.org\/10.1186\/s13059-016-1029-6.","journal-title":"Genome Biol"},{"issue":"2","key":"259_CR16","doi-asserted-by":"publisher","first-page":"487","DOI":"10.1086\/338919","volume":"70","author":"JA Douglas","year":"2002","unstructured":"Douglas JA, Skol AD, Boehnke M. Probability of detection of genotyping errors and mutations as inheritance inconsistencies in nuclear-family data. Am J Hum Genet. 2002; 70(2):487\u201395. https:\/\/doi.org\/10.1086\/338919.","journal-title":"Am J Hum Genet"},{"key":"259_CR17","doi-asserted-by":"publisher","first-page":"16","DOI":"10.3389\/fgene.2014.00016","volume":"5","author":"ZH Patel","year":"2014","unstructured":"Patel ZH, Kottyan LC, Lazaro S, Williams MS, Ledbetter DH, Tromp G, Rupert A, Kohram M, Wagner M, Husami A, Qian Y, Valencia CA, Zhang K, Hostetter MK, Harley JB, Kaufman KM. The struggle to find reliable results in exome sequencing data: Filtering out Mendelian errors. Front Genet. 2014; 5:16. https:\/\/doi.org\/10.3389\/fgene.2014.00016.","journal-title":"Front Genet"},{"issue":"2","key":"259_CR18","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1016\/j.ygeno.2017.01.005","volume":"109","author":"Y Guo","year":"2017","unstructured":"Guo Y, Dai Y, Yu H, Zhao S, Samuels DC, Shyr Y. 
Improvements and impacts of GRCh38 human reference on high throughput sequencing data analysis. Genomics. 2017; 109(2):83\u201390. https:\/\/doi.org\/10.1016\/j.ygeno.2017.01.005.","journal-title":"Genomics"},{"key":"259_CR19","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1007\/978-1-4939-7471-9_6","volume":"1706","author":"JK DiStefano","year":"2018","unstructured":"DiStefano JK. The emerging role of long noncoding RNAs in human disease. Methods Mol Biol. 2018; 1706:91\u2013110. https:\/\/doi.org\/10.1007\/978-1-4939-7471-9_6.","journal-title":"Methods Mol Biol"},{"issue":"20","key":"259_CR20","doi-asserted-by":"publisher","first-page":"2843","DOI":"10.1093\/bioinformatics\/btu356","volume":"30","author":"H Li","year":"2014","unstructured":"Li H, Wren J. Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics. 2014; 30(20):2843\u201351. https:\/\/doi.org\/10.1093\/bioinformatics\/btu356.","journal-title":"Bioinformatics"},{"issue":"1","key":"259_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2164-13-1","volume":"13","author":"Y Guo","year":"2012","unstructured":"Guo Y, Long J, He J, Li CI, Cai Q, Shu XO, Zheng W, Li C. Exome sequencing generates high quality data in non-target regions. BMC Genomics. 2012; 13(1):1\u201310. https:\/\/doi.org\/10.1186\/1471-2164-13-194.","journal-title":"BMC Genomics"},{"key":"259_CR22","doi-asserted-by":"publisher","unstructured":"Dou J, Wu D, Ding L, Wang K, Jiang M, Chai X, Reilly DF, Tai ES, Liu J, Sim X, Cheng S, Wang C. Using off-target data from whole-exome sequencing to improve genotyping accuracy, association analysis and polygenic risk prediction. Briefings in Bioinformatics. 2020. https:\/\/doi.org\/10.1093\/bib\/bbaa084.","DOI":"10.1093\/bib\/bbaa084"},{"key":"259_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.2147\/AGG.S128824","volume":"7","author":"L Joesch-Cohen","year":"2017","unstructured":"Joesch-Cohen L, Glusman G. 
Differences between the genomes of lymphoblastoid cell lines and blood-derived samples. Adv Genomics Genet. 2017; 7:1. https:\/\/doi.org\/10.2147\/agg.s128824.","journal-title":"Adv Genomics Genet"},{"issue":"1","key":"259_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13073-016-0269-0","volume":"8","author":"RL Goldfeder","year":"2016","unstructured":"Goldfeder RL, Priest JR, Zook JM, Grove ME, Waggott D, Wheeler MT, Salit M, Ashley EA. Medical implications of technical accuracy in genome sequencing. Genome Med. 2016; 8(1):1\u201312. https:\/\/doi.org\/10.1186\/s13073-016-0269-0.","journal-title":"Genome Med"},{"issue":"11","key":"259_CR25","doi-asserted-by":"publisher","first-page":"1734","DOI":"10.1101\/gr.168393.113","volume":"24","author":"JD Wall","year":"2014","unstructured":"Wall JD, Tang LF, Zerbe B, Kvale MN, Kwok PY, Schaefer C, Risch N. Estimating genotype error rates from high-coverage next-generation sequence data. Genome Res. 2014; 24(11):1734\u20139. https:\/\/doi.org\/10.1101\/gr.168393.113.","journal-title":"Genome Res"},{"issue":"1","key":"259_CR26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2164-12-464","volume":"12","author":"ER Londin","year":"2011","unstructured":"Londin ER, Keller MA, D\u2019Andrea MR, Delgrosso K, Ertel A, Surrey S, Fortina P. Whole-exome sequencing of DNA from peripheral blood mononuclear cells (PBMC) and EBV-transformed lymphocytes from the same donor. BMC Genomics. 2011; 12(1):1\u20139. https:\/\/doi.org\/10.1186\/1471-2164-12-464.","journal-title":"BMC Genomics"},{"issue":"1","key":"259_CR27","doi-asserted-by":"publisher","first-page":"115","DOI":"10.2307\/3314676","volume":"8","author":"DR McDonald","year":"1980","unstructured":"McDonald DR. On the Poisson approximation to the multinomial distribution. Can J Stat. 
1980; 8(1):115\u20138.","journal-title":"Can J Stat"},{"issue":"4","key":"259_CR28","doi-asserted-by":"publisher","first-page":"850","DOI":"10.1016\/j.cell.2019.07.015","volume":"178","author":"EK Ruzzo","year":"2019","unstructured":"Ruzzo EK, P\u00e9rez-Cano L, Jung JY, Wang LK, Kashef-Haghighi D, Hartl C, Singh C, Xu J, Hoekstra JN, Leventhal O, Lepp\u00e4 VM, Gandal MJ, Paskov K, Stockham N, Polioudakis D, Lowe JK, Prober DA, Geschwind DH, Wall DP. Inherited and de novo genetic risk for autism impacts shared networks. Cell. 2019; 178(4):850\u201366. https:\/\/doi.org\/10.1016\/j.cell.2019.07.015.","journal-title":"Cell"},{"issue":"3","key":"259_CR29","doi-asserted-by":"publisher","first-page":"710","DOI":"10.1016\/j.cell.2017.08.047","volume":"171","author":"TN Turner","year":"2017","unstructured":"Turner TN, Coe BP, Dickel DE, Hoekzema K, Nelson BJ, Zody MC, Kronenberg ZN, Hormozdiari F, Raja A, Pennacchio LA, Darnell RB, Eichler EE. Genomic patterns of de novo mutation in simplex autism. Cell. 2017; 171(3):710\u201322. https:\/\/doi.org\/10.1016\/j.cell.2017.08.047.","journal-title":"Cell"},{"issue":"1","key":"259_CR30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41525-019-0093-8","volume":"4","author":"P Feliciano","year":"2019","unstructured":"Feliciano P, Zhou X, Astrovskaya I, Turner TN, Wang T, Brueggeman L, Barnard R, Hsieh A, Snyder LG, Muzny DM, Sabo A. Exome sequencing of 457 autism families recruited online provides evidence for autism risk genes. NPJ Genomic Med. 2019; 4(1):1\u20134.","journal-title":"NPJ Genomic Med"},{"issue":"3","key":"259_CR31","doi-asserted-by":"publisher","first-page":"488","DOI":"10.1016\/j.neuron.2018.01.015","volume":"97","author":"P Feliciano","year":"2018","unstructured":"Feliciano P, Daniels AM, Snyder LG, Beaumont A, Camba A, Esler A, Gulsrud AG, Mason A, Gutierrez A, Nicholson A, Paolicelli AM. SPARK: a US cohort of 50,000 families to accelerate autism research. Neuron. 
2018; 97(3):488\u201393.","journal-title":"Neuron"},{"issue":"7412","key":"259_CR32","doi-asserted-by":"publisher","first-page":"471","DOI":"10.1038\/nature11396","volume":"488","author":"A Kong","year":"2012","unstructured":"Kong A, Frigge ML, Masson G, Besenbacher S, Sulem P, Magnusson G, Gudjonsson SA, Sigurdsson A, Jonasdottir A, Jonasdottir A, Wong WSW, Sigurdsson G, Walters GB, Steinberg S, Helgason H, Thorleifsson G, Gudbjartsson DF, Helgason A, Magnusson OT, Thorsteinsdottir U, Stefansson K. Rate of de novo mutations and the importance of father\u2019s age to disease risk. Nature. 2012; 488(7412):471\u20135. https:\/\/doi.org\/10.1038\/nature11396.","journal-title":"Nature"},{"issue":"D1","key":"259_CR33","doi-asserted-by":"publisher","first-page":"D1005","DOI":"10.1093\/nar\/gky1120","volume":"47","author":"A Buniello","year":"2019","unstructured":"Buniello A, Macarthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, McMahon A, Morales J, Mountjoy E, Sollis E, Suveges D, Vrousgou O, Whetzel PL, Amode R, Guillen JA, Riat HS, Trevanion SJ, Hall P, Junkins H, Flicek P, Burdett T, Hindorff LA, Cunningham F, Parkinson H. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019; 47(D1):D1005\u201312. https:\/\/doi.org\/10.1093\/nar\/gky1120.","journal-title":"Nucleic Acids Res"},{"issue":"6","key":"259_CR34","doi-asserted-by":"publisher","first-page":"996","DOI":"10.1101\/gr.229102","volume":"12","author":"WJ Kent","year":"2002","unstructured":"Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. The human genome browser at UCSC. Genome Res. 2002; 12(6):996\u20131006. 
https:\/\/doi.org\/10.1101\/gr.229102.","journal-title":"Genome Res"}],"container-title":["BioData Mining"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13040-021-00259-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13040-021-00259-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13040-021-00259-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,4,24]],"date-time":"2021-04-24T02:03:55Z","timestamp":1619229835000},"score":1,"resource":{"primary":{"URL":"https:\/\/biodatamining.biomedcentral.com\/articles\/10.1186\/s13040-021-00259-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,23]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["259"],"URL":"https:\/\/doi.org\/10.1186\/s13040-021-00259-6","relation":{},"ISSN":["1756-0381"],"issn-type":[{"value":"1756-0381","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,23]]},"assertion":[{"value":"26 December 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 March 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 April 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"All data used in this study is publicly available.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not 
applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"27"}}