{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,28]],"date-time":"2025-11-28T17:11:36Z","timestamp":1764349896772},"reference-count":22,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2007,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Tiling microarrays are becoming an essential technology in the functional genomics toolbox. They have been applied to the tasks of novel transcript identification, elucidation of transcription factor binding sites, detection of methylated DNA and several other applications in several model organisms. These experiments are being conducted at increasingly finer resolutions as the microarray technology enjoys increasingly greater feature densities. The increased densities naturally lead to increased data analysis requirements. Specifically, the most widely employed algorithm for tiling array analysis involves smoothing observed signals by computing pseudomedians within sliding windows, a <jats:italic>O<\/jats:italic>(<jats:italic>n<\/jats:italic>\n              <jats:sup>2<\/jats:sup>log<jats:italic>n<\/jats:italic>) calculation in each window. This poor time complexity is an issue for tiling array analysis and could prove to be a real bottleneck as tiling microarray experiments become grander in scope and finer in resolution.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We therefore implemented Monahan's HLQEST algorithm that reduces the runtime complexity for computing the pseudomedian of <jats:italic>n<\/jats:italic> numbers to <jats:italic>O<\/jats:italic>(<jats:italic>n<\/jats:italic> log<jats:italic>n<\/jats:italic>) from <jats:italic>O<\/jats:italic>(<jats:italic>n<\/jats:italic>\n              <jats:sup>2<\/jats:sup>log<jats:italic>n<\/jats:italic>). For a representative tiling microarray dataset, this modification reduced the smoothing procedure's runtime by nearly 90%. We then leveraged the fact that elements within sliding windows remain largely unchanged in overlapping windows (as one slides across genomic space) to further reduce computation by an additional 43%. This was achieved by the application of skip lists to maintaining a sorted list of values from window to window. This sorted list could be maintained with simple <jats:italic>O<\/jats:italic>(log <jats:italic>n<\/jats:italic>) inserts and deletes. We illustrate the favorable scaling properties of our algorithms with both time complexity analysis and benchmarking on synthetic datasets.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>Tiling microarray analyses that rely upon a sliding window pseudomedian calculation can require many hours of computation. We have eased this requirement significantly by implementing efficient algorithms that scale well with genomic feature density. This result not only speeds the current standard analyses, but also makes possible ones where many iterations of the filter may be required, such as might be required in a bootstrap or parameter estimation setting. Source code and executables are available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/tiling.gersteinlab.org\/pseudomedian\/\" ext-link-type=\"uri\">http:\/\/tiling.gersteinlab.org\/pseudomedian\/<\/jats:ext-link>.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-8-186","type":"journal-article","created":{"date-parts":[[2007,6,7]],"date-time":"2007-06-07T18:14:08Z","timestamp":1181240048000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["An efficient pseudomedian filter for tiling microrrays"],"prefix":"10.1186","volume":"8","author":[{"given":"Thomas E","family":"Royce","sequence":"first","affiliation":[]},{"given":"Nicholas J","family":"Carriero","sequence":"additional","affiliation":[]},{"given":"Mark B","family":"Gerstein","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2007,6,7]]},"reference":[{"key":"1558_CR1","doi-asserted-by":"publisher","first-page":"1262","DOI":"10.1038\/82367","volume":"18","author":"DW Selinger","year":"2000","unstructured":"Selinger DW, Cheung KJ, Mei R, Johansson EM, Richmond CS, Blattner FR, Lockhart DJ, Church GM: RNA expression analysis using a 30 base pair resolution Escherichia coli genome array. Nat Biotechnol. 2000, 18: 1262-1268. 10.1038\/82367.","journal-title":"Nat Biotechnol"},{"key":"1558_CR2","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1101\/gr.4452906","volume":"16","author":"P Bertone","year":"2006","unstructured":"Bertone P, Trifonov V, Rozowsky JS, Schubert F, Emanuelsson O, Karro J, Kao M, Snyder M, Gerstein M: Design optimization methods for genomic DNA tiling arrays. Genome Res. 2006, 16: 271-281. 10.1101\/gr.4452906.","journal-title":"Genome Res"},{"key":"1558_CR3","doi-asserted-by":"publisher","first-page":"903","DOI":"10.1038\/nmeth951","volume":"3","author":"SJ Wheelan","year":"2006","unstructured":"Wheelan SJ, Martinez-Murillo F, Irizarry RA, Boeke JD: Stacking the deck: double-tiled DNA microarrays. Nat Methods. 2006, 3: 903-907. 10.1038\/nmeth951.","journal-title":"Nat Methods"},{"key":"1558_CR4","doi-asserted-by":"publisher","first-page":"636","DOI":"10.1126\/science.1105136","volume":"306","author":"ENCODE Consortium","year":"2004","unstructured":"ENCODE Consortium: The ENCODE (ENCyclopedia Of DNA Elements) project. Science. 2004, 306: 636-640. 10.1126\/science.1105136.","journal-title":"Science"},{"key":"1558_CR5","doi-asserted-by":"publisher","first-page":"466","DOI":"10.1016\/j.tig.2005.06.007","volume":"21","author":"TE Royce","year":"2005","unstructured":"Royce TE, Rozowsky JS, Bertone P, Samanta M, Stolc V, Weissman S, Snyder M, Gerstein M: Issues in the analysis of oligonucleotide tiling microarrays for transcript mapping. Trends Genet. 2005, 21: 466-475. 10.1016\/j.tig.2005.06.007.","journal-title":"Trends Genet"},{"key":"1558_CR6","doi-asserted-by":"publisher","first-page":"916","DOI":"10.1126\/science.1068597","volume":"296","author":"P Kapranov","year":"2002","unstructured":"Kapranov P, Cawley SE, Drenkow J, Bekiranov S, Strausberg RL, Fodor SPA, Gingeras TR: Large-scale transcriptional activity in chromosomes 21 and 22. Science. 2002, 296: 916-919. 10.1126\/science.1068597.","journal-title":"Science"},{"key":"1558_CR7","doi-asserted-by":"publisher","first-page":"R73","DOI":"10.1186\/gb-2004-5-10-r73","volume":"5","author":"EE Schadt","year":"2004","unstructured":"Schadt EE, Edwards SW, GuhaThakurta D, Holder D, Ying L, Svetnik V, Leonardson A, Hart KW, Russell A, Li G, Cavet G, Castle J, McDonagh P, Kan Z, Chen R, Kasarskis A, Margarint M, Caceres RM, Johnson JM, Armour CD, Garrett-Engele PW, Tsinoremas NF, Shoemaker DD: A comprehensive transcript index of the human genome generated using microarrays and computational approaches. Genome Biol. 2004, 5: R73-10.1186\/gb-2004-5-10-r73.","journal-title":"Genome Biol"},{"key":"1558_CR8","doi-asserted-by":"publisher","first-page":"2242","DOI":"10.1126\/science.1103388","volume":"306","author":"P Bertone","year":"2004","unstructured":"Bertone P, Stolc V, Royce TE, Rozowsky JS, Urban AE, Zhu X, Rinn JL, Tongprasit W, Samanta M, Weissman S, Gerstein M, Snyder M: Global identification of human transcribed sequences with genome tiling arrays. Science. 2004, 306: 2242-2246. 10.1126\/science.1103388.","journal-title":"Science"},{"key":"1558_CR9","doi-asserted-by":"publisher","first-page":"331","DOI":"10.1101\/gr.2094104","volume":"14","author":"D Kampa","year":"2004","unstructured":"Kampa D, Cheng J, Kapranov P, Yamanaka M, Brubaker S, Cawley S, Drenkow J, Piccolboni A, Bekiranov S, Helt G, Tammana H, Gingeras TR: Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. Genome Res. 2004, 14: 331-342. 10.1101\/gr.2094104.","journal-title":"Genome Res"},{"key":"1558_CR10","doi-asserted-by":"publisher","first-page":"1963","DOI":"10.1093\/bioinformatics\/btl289","volume":"22","author":"W Huber","year":"2006","unstructured":"Huber W, Toedling J, Steinmetz LM: Transcript mapping with high-density oligonucleotide tiling arrays. Bioinformatics. 2006, 22: 1963-1970. 10.1093\/bioinformatics\/btl289.","journal-title":"Bioinformatics"},{"key":"1558_CR11","doi-asserted-by":"publisher","first-page":"3016","DOI":"10.1093\/bioinformatics\/btl515","volume":"22","author":"J Du","year":"2006","unstructured":"Du J, Rozowsky JS, Korbel JO, Zhang ZD, Royce TE, Schultz MH, Snyder M, Gerstein M: A supervised hidden markov model framework for efficiently segmenting tiling array data in transcriptional and chip-chip experiments: systematically incorporating validated biological knowledge. Bioinformatics. 2006, 22: 3016-3024. 10.1093\/bioinformatics\/btl515.","journal-title":"Bioinformatics"},{"key":"1558_CR12","doi-asserted-by":"publisher","first-page":"3629","DOI":"10.1093\/bioinformatics\/bti593","volume":"21","author":"H Ji","year":"2005","unstructured":"Ji H, Wong WH: Tilemap: create chromosomal map of tiling array hybridizations. Bioinformatics. 2005, 21: 3629-3636. 10.1093\/bioinformatics\/bti593.","journal-title":"Bioinformatics"},{"issue":"Suppl 1","key":"1558_CR13","doi-asserted-by":"publisher","first-page":"i274","DOI":"10.1093\/bioinformatics\/bti1046","volume":"21","author":"W Li","year":"2005","unstructured":"Li W, Meyer CA, Liu XS: A hidden markov model for analyzing chip-chip experiments on genome tiling arrays and its application to p53 binding sequences. Bioinformatics. 2005, 21 (Suppl 1): i274-82. 10.1093\/bioinformatics\/bti1046.","journal-title":"Bioinformatics"},{"key":"1558_CR14","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1186\/1471-2105-7-239","volume":"7","author":"K Munch","year":"2006","unstructured":"Munch K, Gardner PP, Arctander P, Krogh A: A hidden markov model approach for determining expression from genomic tiling micro arrays. BMC Bioinformatics. 2006, 7: 239-10.1186\/1471-2105-7-239.","journal-title":"BMC Bioinformatics"},{"key":"1558_CR15","doi-asserted-by":"publisher","first-page":"991","DOI":"10.1038\/ng1630","volume":"37","author":"BJ Frey","year":"2005","unstructured":"Frey BJ, Mohammad N, Morris QD, Zhang W, Robinson MD, Mnaimneh S, Chang R, Pan Q, Sat E, Rossant J, Bruneau BG, Aubin JE, Blencowe BJ, Hughes TR: Genome-wide analysis of mouse transcripts using exon microarrays and factor graphs. Nat Genet. 2005, 37: 991-996. 10.1038\/ng1630.","journal-title":"Nat Genet"},{"key":"1558_CR16","volume-title":"Bioinformatics","author":"T Royce","year":"2007","unstructured":"Royce T, Rozowsky J, Gerstein M: Assessing the need for sequence-based normalization in tiling microarray experiments. Bioinformatics. 2007,"},{"key":"1558_CR17","first-page":"858","volume":"36","author":"P Bickel","year":"1965","unstructured":"Bickel P: On some robust estimates of location. Annals of Mathematical Statistics. 1965, 36: 858-","journal-title":"Annals of Mathematical Statistics"},{"key":"1558_CR18","doi-asserted-by":"publisher","first-page":"83","DOI":"10.2307\/3001969","volume":"1","author":"F Wilcoxon","year":"1945","unstructured":"Wilcoxon F: Individual comparisons by ranking methods. Biometrics Bulletin. 1945, 1: 83-10.2307\/3001969.","journal-title":"Biometrics Bulletin"},{"key":"1558_CR19","doi-asserted-by":"publisher","first-page":"270","DOI":"10.1145\/1271.319414","volume":"10","author":"J Monahan","year":"1984","unstructured":"Monahan J: Fast computation of the Hodges-Lehman location estimator. ACM Transactions on Mathematical Software. 1984, 10: 270-10.1145\/1271.319414.","journal-title":"ACM Transactions on Mathematical Software"},{"key":"1558_CR20","doi-asserted-by":"publisher","first-page":"676","DOI":"10.1145\/78973.78977","volume":"33","author":"W Pugh","year":"1990","unstructured":"Pugh W: Skip lists \u2013 a probabilistic alternative to balanced trees. Communications of the ACM. 1990, 33: 676-10.1145\/78973.78977.","journal-title":"Communications of the ACM"},{"key":"1558_CR21","volume-title":"Genome Res","author":"O Emanuelsson","year":"2006","unstructured":"Emanuelsson O, Nagalakshmi U, Zheng D, Rozowsky J, Urban A, Du J, Lian Z, Stolc V, Weissman S, Snyder M, Gerstein M: Assessing the performance of different high-density tiling microarray strategies for mapping transcribed regions of the human genome. Genome Res. 2006,"},{"key":"1558_CR22","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1093\/nar\/30.1.207","volume":"30","author":"R Edgar","year":"2002","unstructured":"Edgar R, Domrachev M, Lash AE: Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002, 30: 207-210. 10.1093\/nar\/30.1.207.","journal-title":"Nucleic Acids Res"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-8-186.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T10:17:39Z","timestamp":1630491459000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-8-186"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,6,7]]},"references-count":22,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2007,12]]}},"alternative-id":["1558"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-8-186","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,6,7]]},"assertion":[{"value":"30 March 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 June 2007","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 June 2007","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"186"}}