{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T12:34:46Z","timestamp":1721219686776},"reference-count":27,"publisher":"Springer Science and Business Media LLC","issue":"S2","license":[{"start":{"date-parts":[[2020,3,1]],"date-time":"2020-03-01T00:00:00Z","timestamp":1583020800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,3,11]],"date-time":"2020-03-11T00:00:00Z","timestamp":1583884800000},"content-version":"vor","delay-in-days":10,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Genomic micro-satellites are the genomic regions that consist of short and repetitive DNA motifs. Estimating the length distribution and state of a micro-satellite region is an important computational step in cancer sequencing data pipelines, which is suggested to facilitate the downstream analysis and clinical decision supporting. Although several state-of-the-art approaches have been proposed to identify micro-satellite instability (MSI) events, they are limited in dealing with regions longer than one read length. Moreover, based on our best knowledge, all of these approaches imply a hypothesis that the tumor purity of the sequenced samples is sufficiently high, which is inconsistent with the reality, leading the inferred length distribution to dilute the data signal and introducing the false positive errors.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>In this article, we proposed a computational approach, named <jats:italic>ELMSI<\/jats:italic>, which detected MSI events based on the next generation sequencing technology. <jats:italic>ELMSI<\/jats:italic> can estimate the specific length distributions and states of micro-satellite regions from a mixed tumor sample paired with a control one. It first estimated the purity of the tumor sample based on the read counts of the filtered SNVs loci. Then, the algorithm identified the length distributions and the states of short micro-satellites by adding the Maximum Likelihood Estimation (MLE) step to the existing algorithm. After that, <jats:italic>ELMSI<\/jats:italic> continued to infer the length distributions of long micro-satellites by incorporating a simplified Expectation Maximization (EM) algorithm with central limit theorem, and then used statistical tests to output the states of these micro-satellites. Based on our experimental results, <jats:italic>ELMSI<\/jats:italic> was able to handle micro-satellites with lengths ranging from shorter than one read length to 10kbps.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>To verify the reliability of our algorithm, we first compared the ability of classifying the shorter micro-satellites from the mixed samples with the existing algorithm <jats:italic>MSIsensor<\/jats:italic>. Meanwhile, we varied the number of micro-satellite regions, the read length and the sequencing coverage to separately test the performance of <jats:italic>ELMSI<\/jats:italic> on estimating the longer ones from the mixed samples. <jats:italic>ELMSI<\/jats:italic> performed well on mixed samples, and thus <jats:italic>ELMSI<\/jats:italic> was of great value for improving the recognition effect of micro-satellite regions and supporting clinical decision supporting. The source codes have been uploaded and maintained at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/YixuanWang1120\/ELMSI\">https:\/\/github.com\/YixuanWang1120\/ELMSI<\/jats:ext-link>\nfor academic use only.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-020-3349-5","type":"journal-article","created":{"date-parts":[[2020,3,13]],"date-time":"2020-03-13T02:02:49Z","timestamp":1584064969000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Accurately estimating the length distributions of genomic micro-satellites by tumor purity deconvolution"],"prefix":"10.1186","volume":"21","author":[{"given":"Yixuan","family":"Wang","sequence":"first","affiliation":[]},{"given":"Xuanping","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Xiao","family":"Xiao","sequence":"additional","affiliation":[]},{"given":"Fei-Ran","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Xinxing","family":"Yan","sequence":"additional","affiliation":[]},{"given":"Xuan","family":"Feng","sequence":"additional","affiliation":[]},{"given":"Zhongmeng","family":"Zhao","sequence":"additional","affiliation":[]},{"given":"Yanfang","family":"Guan","sequence":"additional","affiliation":[]},{"given":"Jiayin","family":"Wang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,3,11]]},"reference":[{"issue":"1367","key":"3349_CR1","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1098\/rspb.1996.0033","volume":"263","author":"D Field","year":"1996","unstructured":"Field D, Wills C. Long, polymorphic microsatellites in simple organisms. Proc Biol Sci. 1996; 263(1367):209.","journal-title":"Proc Biol Sci"},{"issue":"7","key":"3349_CR2","doi-asserted-by":"publisher","first-page":"967","DOI":"10.1101\/gr.10.7.967","volume":"10","author":"G T\u00f3th","year":"2000","unstructured":"T\u00f3th G, G\u00e1sp\u00e1ri Z, Jurka J. Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 2000; 10(7):967.","journal-title":"Genome Res"},{"issue":"6","key":"3349_CR3","doi-asserted-by":"publisher","first-page":"435","DOI":"10.1038\/nrg1348","volume":"5","author":"H Ellegren","year":"2004","unstructured":"Ellegren H. Microsatellites: simple sequences with complex evolution. Nat Rev Genet. 2004; 5(6):435\u201345.","journal-title":"Nat Rev Genet"},{"issue":"9","key":"3349_CR4","doi-asserted-by":"publisher","first-page":"1698","DOI":"10.1002\/elps.11501601282","volume":"16","author":"H Hummerich","year":"1995","unstructured":"Hummerich H, Lehrach H. Trinucleotide repeat expansion and human disease. Electrophoresis. 1995; 16(9):1698\u2013704.","journal-title":"Electrophoresis"},{"issue":"5","key":"3349_CR5","doi-asserted-by":"publisher","first-page":"352","DOI":"10.1053\/j.semdp.2015.02.018","volume":"32","author":"J Shia","year":"2015","unstructured":"Shia J. Evolving approach and clinical significance of detecting dna mismatch repair deficiency in colorectal carcinoma. Semin Diagn Pathol. 2015; 32(5):352\u201361.","journal-title":"Semin Diagn Pathol"},{"issue":"4","key":"3349_CR6","doi-asserted-by":"publisher","first-page":"858","DOI":"10.1016\/j.cell.2013.10.015","volume":"155","author":"TM Kim","year":"2013","unstructured":"Kim TM, Laird PW, Park PJ. The landscape of microsatellite instability in colorectal and endometrial cancer genomes. Cell. 2013; 155(4):858\u201368.","journal-title":"Cell"},{"issue":"15","key":"3349_CR7","doi-asserted-by":"publisher","first-page":"2525","DOI":"10.1038\/sj.onc.1208456","volume":"24","author":"SM Woerner","year":"2005","unstructured":"Woerner SM, Kloor M, Mueller A, Rueschoff J, Friedrichs N, Buettner R, Buzello M, Kienle P, Knaebel HP, Kunstmann E. Microsatellite instability of selective target genes in hnpcc-associated colon adenomas. Oncogene. 2005; 24(15):2525\u201335.","journal-title":"Oncogene"},{"key":"3349_CR8","doi-asserted-by":"publisher","first-page":"4988","DOI":"10.1038\/ncomms5988","volume":"5","author":"CC Pritchard","year":"2014","unstructured":"Pritchard CC, Morrissey C, Kumar A, Zhang X, Smith C, Coleman I, Salipante SJ, Milbank J, Yu M, Grady WM. Complex MSH2 and MSH6 mutations in hypermutated microsatellite unstable advanced prostate cancer. Nat Commun. 2014; 5:4988.","journal-title":"Nat Commun"},{"issue":"5","key":"3349_CR9","doi-asserted-by":"publisher","first-page":"502","DOI":"10.1158\/2159-8290.CD-12-0471","volume":"3","author":"E Vilar","year":"2013","unstructured":"Vilar E, Tabernero J. Molecular dissection of microsatellite instable colorectal cancer. Cancer Discov. 2013; 3(5):502\u201311.","journal-title":"Cancer Discov"},{"issue":"11","key":"3349_CR10","first-page":"21138","volume":"8","author":"B Li","year":"2015","unstructured":"Li B, Liu HY, Guo SH, Sun P, Gong FM, Jia BQ. Microsatellite instability of gastric cancer and precancerous lesions. Int J Clin Exp Med. 2015; 8(11):21138\u201344.","journal-title":"Int J Clin Exp Med"},{"issue":"4","key":"3349_CR11","first-page":"1387","volume":"9","author":"C Shannon","year":"2003","unstructured":"Shannon C, Kirk J, Barnetson R, Evans J, Schnitzler M, Quinn M, Hacker N, Crandon A, Harnett P. Incidence of microsatellite instability in synchronous tumors of the ovary and endometrium. Clin Cancer Res. 2003; 9(4):1387\u201392.","journal-title":"Clin Cancer Res"},{"issue":"3","key":"3349_CR12","doi-asserted-by":"publisher","first-page":"247","DOI":"10.1056\/NEJMoa022289","volume":"349","author":"CG Moertel","year":"2003","unstructured":"Moertel CG. Tumor microsatellite-instability status as a predictor of benefit from fluorouracil-based adjuvant chemotherapy for colon cancer. N Engl J Med. 2003; 349(3):247\u201357.","journal-title":"N Engl J Med"},{"issue":"4-5","key":"3349_CR13","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1155\/2004\/368680","volume":"20","author":"TM Pawlik","year":"2013","unstructured":"Pawlik TM, Raut CP, Rodriguezbigas MA. Colorectal carcinogenesis: Msi-h versus msi-l. Dis Markers. 2013; 20(4-5):199\u2013206.","journal-title":"Dis Markers"},{"issue":"2","key":"3349_CR14","doi-asserted-by":"publisher","first-page":"142","DOI":"10.6004\/jnccn.2017.0016","volume":"15","author":"J Gong","year":"2017","unstructured":"Gong J, Wang C, Lee PP, Chu P, Fakih M. Response to pd-1 blockade in microsatellite stable metastatic colorectal cancer harboring a pole mutation. J Natl Compr Cancer Netw Jnccn. 2017; 15(2):142.","journal-title":"J Natl Compr Cancer Netw Jnccn"},{"issue":"7","key":"3349_CR15","doi-asserted-by":"publisher","first-page":"1015","DOI":"10.1093\/bioinformatics\/btt755","volume":"30","author":"B Niu","year":"2014","unstructured":"Niu B, Ye K, Zhang Q, Lu C, Xie M, Mclellan MD, Wendl MC, Ding L. Msisensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics. 2014; 30(7):1015\u20136.","journal-title":"Bioinformatics"},{"issue":"9","key":"3349_CR16","doi-asserted-by":"publisher","first-page":"1192","DOI":"10.1373\/clinchem.2014.223677","volume":"60","author":"SJ Salipante","year":"2014","unstructured":"Salipante SJ, Scroggins SM, Hampel HL, Turner EH, Pritchard CC. Microsatellite instability detection by next generation sequencing. Clin Chem. 2014; 60(9):1192\u20139.","journal-title":"Clin Chem"},{"issue":"5","key":"3349_CR17","doi-asserted-by":"publisher","first-page":"7452","DOI":"10.18632\/oncotarget.13918","volume":"8","author":"EA Kautto","year":"2017","unstructured":"Kautto EA, Bonneville R, Miya J, Yu L, Krook MA, Reeser JW, Roychowdhury S. Performance evaluation for rapid detection of pan-cancer microsatellite instability with mantis. Oncotarget. 2017; 8(5):7452.","journal-title":"Oncotarget"},{"issue":"1","key":"3349_CR18","doi-asserted-by":"publisher","first-page":"13321","DOI":"10.1038\/srep13321","volume":"5","author":"MN Huang","year":"2015","unstructured":"Huang MN, Mcpherson JR, Cutcutache I, Teh BT, Tan P, Rozen SG. Msiseq: Software for assessing microsatellite instability from catalogs of somatic mutations. Sci Rep. 2015; 5(1):13321.","journal-title":"Sci Rep"},{"key":"3349_CR19","doi-asserted-by":"publisher","unstructured":"Wang C, Liang C. Msipred: a python package for tumor microsatellite instability classification from tumor mutation annotation data using a support vector machine. Sci Rep. 2018; 8(1). https:\/\/doi.org\/10.1038\/s41598-018-35682-z.","DOI":"10.1038\/s41598-018-35682-z"},{"issue":"23","key":"3349_CR20","doi-asserted-by":"publisher","first-page":"3799","DOI":"10.1093\/bioinformatics\/btx507","volume":"33","author":"S Foltz","year":"2017","unstructured":"Foltz S, Liang WW, Xie M, Ding L. Mirmmr: binary classification of microsatellite instability using methylation and mutations. Bioinformatics. 2017; 33(23):3799\u2013801.","journal-title":"Bioinformatics"},{"issue":"5","key":"3349_CR21","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1038\/nbt.2203","volume":"30","author":"SL Carter","year":"2012","unstructured":"Carter SL, Kristian C, Elena H, Aaron MK, Hui S, Travis Z, Laird PW, Onofrio RC, Wendy W, Weir BA. Absolute quantification of somatic dna alterations in human cancer. Nat Biotechnol. 2012; 30(5):413\u201321.","journal-title":"Nat Biotechnol"},{"key":"3349_CR22","doi-asserted-by":"publisher","unstructured":"Yu G, Zhao Z, Liu R, Tian Z, Jing X, Yi H, Zhang X, Xiao X, Wang J. Accurately estimating tumor purity of samples with high degree of heterogeneity from cancer sequencing data. In: Intelligent Computing Theories and Application: 2017. p. 273\u2013285. https:\/\/doi.org\/10.1007\/978-3-319-63312-1_25.","DOI":"10.1007\/978-3-319-63312-1_25"},{"issue":"18","key":"3349_CR23","doi-asserted-by":"publisher","first-page":"10774","DOI":"10.1073\/pnas.95.18.10774","volume":"95","author":"S Kruglyak","year":"1998","unstructured":"Kruglyak S, Durrett RT, Schug MD, Aquadro CF. Equilibrium distributions of microsatellite repeat length resulting from a balance between slippage events and point mutations. Proc Natl Acad Sci U S A. 1998; 95(18):10774\u20138.","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"4","key":"3349_CR24","doi-asserted-by":"publisher","first-page":"414","DOI":"10.1007\/PL00006161","volume":"44","author":"G I. Bell","year":"1997","unstructured":"I. Bell G, Jurka J. The length distribution of perfect dimer repetitive dna is consistent with its evolution by an unbiased single-step mutation process. J Mol Evol. 1997; 44(4):414\u201321.","journal-title":"J Mol Evol"},{"issue":"1","key":"3349_CR25","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1002\/1097-0142(20010701)92:1<92::AID-CNCR1296>3.0.CO;2-W","volume":"92","author":"CW Wu","year":"2015","unstructured":"Wu CW, Chen GD, Jiang KC, Li AF, Chi CW, Lo SS, Chen JY. A genome-wide study of microsatellite instability in advanced gastric carcinoma. Cancer. 2015; 92(1):92\u2013101.","journal-title":"Cancer"},{"issue":"20","key":"3349_CR26","doi-asserted-by":"publisher","first-page":"2843","DOI":"10.1093\/bioinformatics\/btu356","volume":"30","author":"H Li","year":"2014","unstructured":"Li H. Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics. 2014; 30(20):2843\u201351.","journal-title":"Bioinformatics"},{"issue":"1","key":"3349_CR27","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1186\/s12864-019-5516-5","volume":"20","author":"S Srivastava","year":"2019","unstructured":"Srivastava S, Avvaru A, Sowpati DT, Mishra RK. Patterns of microsatellite distribution across eukaryotic genomes. BMC Genomics. 2019; 20(1):153.","journal-title":"BMC Genomics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-3349-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-020-3349-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-3349-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,11]],"date-time":"2021-03-11T00:04:04Z","timestamp":1615421044000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-3349-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3]]},"references-count":27,"journal-issue":{"issue":"S2","published-print":{"date-parts":[[2020,3]]}},"alternative-id":["3349"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-3349-5","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3]]},"assertion":[{"value":"11 March 2020","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"82"}}