{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,21]],"date-time":"2025-09-21T17:39:34Z","timestamp":1758476374929},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2007,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>With microarray technology, variability in experimental environments such as RNA sources, microarray production, or the use of different platforms, can cause bias. Such systematic differences present a substantial obstacle to the analysis of microarray data, resulting in inconsistent and unreliable information. Therefore, one of the most pressing challenges in the field of microarray technology is how to integrate results from different microarray experiments or combine data sets prior to the specific analysis.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Two microarray data sets based on a 17k cDNA microarray system were used, consisting of 82 normal colon mucosa and 72 colorectal cancer tissues. Each data set was prepared from either total RNA or amplified mRNA, and the difference of RNA source between these two data sets was detected by ANOVA (Analysis of variance) model. A simple integration method was introduced which was based on the distributions of gene expression ratios among different microarray data sets. The method transformed gene expression ratios into the form of a reference data set on a gene by gene basis. Hierarchical clustering analysis, density and box plots, and mixture scores with correlation coefficients revealed that the two data sets were well intermingled, indicating that the proposed method minimized the experimental bias. 
In addition, any RNA source effect was not detected by the proposed transformation method. In the mixed data set, two previously identified subgroups of normal and tumor were well separated, and the efficiency of integration was more prominent in tumor groups than normal groups. The transformation method was slightly more effective when a data set with strong homogeneity in the same experimental group was used as a reference data set.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>Proposed method is simple but useful to combine several data sets from different experimental conditions. With this method, biologically useful information can be detectable by applying various analytic methods to the combined data set with increased sample size.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-8-218","type":"journal-article","created":{"date-parts":[[2007,6,26]],"date-time":"2007-06-26T06:13:32Z","timestamp":1182838412000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Novel and simple transformation algorithm for combining microarray data sets"],"prefix":"10.1186","volume":"8","author":[{"given":"Ki-Yeol","family":"Kim","sequence":"first","affiliation":[]},{"given":"Dong Hyuk","family":"Ki","sequence":"additional","affiliation":[]},{"given":"Ha Jin","family":"Jeong","sequence":"additional","affiliation":[]},{"given":"Hei-Cheul","family":"Jeung","sequence":"additional","affiliation":[]},{"given":"Hyun Cheol","family":"Chung","sequence":"additional","affiliation":[]},{"given":"Sun Young","family":"Rha","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2007,6,25]]},"reference":[{"key":"1590_CR1","doi-asserted-by":"publisher","first-page":"978","DOI":"10.1128\/EC.1.6.978-986.2002","volume":"1","author":"R Breitling","year":"2002","unstructured":"Breitling R, Sharif O, Hartman ML, Krisans SK: Loss of compartmentalization 
causes misregulation of lysine biosynthesis in peroxisome-deficient yeast cells. Eukaryot Cell. 2002, 1: 978-986. 10.1128\/EC.1.6.978-986.2002.","journal-title":"Eukaryot Cell"},{"key":"1590_CR2","doi-asserted-by":"publisher","first-page":"I84","DOI":"10.1093\/bioinformatics\/btg1010","volume":"19","author":"JK Choi","year":"2003","unstructured":"Choi JK, Yu U, Kim S, Yoo OJ: Combining multiple microarray studies and modeling interstudy variation. Bioinformatics(Suppl). 2003, 19: I84-I90. 10.1093\/bioinformatics\/btg1010.","journal-title":"Bioinformatics(Suppl)"},{"key":"1590_CR3","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1016\/S0014-5793(03)00522-2","volume":"546","author":"V Detours","year":"2003","unstructured":"Detours V, Dumont JE, Bersini H, Maenhaut C: Integration and cross-validation of high-throughput gene expression data: Comparing heterogeneous data sets. FEBS Letters. 2003, 546: 98-102. 10.1016\/S0014-5793(03)00522-2.","journal-title":"FEBS Letters"},{"key":"1590_CR4","doi-asserted-by":"publisher","first-page":"292","DOI":"10.1101\/gr.217802","volume":"12","author":"PD Lee","year":"2002","unstructured":"Lee PD, Sladek R, Greenwood CM, Hudson TJ: Control genes and variability: Absence of ubiquitous reference transcripts in diverse mammalian expression studies. Genome Research. 2002, 12: 292-297. 10.1101\/gr.217802.","journal-title":"Genome Research"},{"key":"1590_CR5","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1038\/ng1060","volume":"33","author":"S Ramaswamy","year":"2003","unstructured":"Ramaswamy S, Ross KN, Lander ES, Golub TR: A molecular signature of metastasis in primary solid tumors. Nat Genet. 2003, 33: 49-54. 
10.1038\/ng1060.","journal-title":"Nat Genet"},{"key":"1590_CR6","first-page":"4427","volume":"62","author":"DR Rhodes","year":"2002","unstructured":"Rhodes DR, Barrette TR, Rubin MA, Ghosh D, Chinnaiyan AM: Meta-analysis of microarrays: Interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Res. 2002, 62: 4427-4433.","journal-title":"Cancer Res"},{"key":"1590_CR7","doi-asserted-by":"publisher","first-page":"8418","DOI":"10.1073\/pnas.0932692100","volume":"100","author":"T Sorlie","year":"2004","unstructured":"Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, Deng S, Johnsen H, Pesich R, Geisler S: Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci USA. 2004, 100: 8418-8423. 10.1073\/pnas.0932692100.","journal-title":"Proc Natl Acad Sci USA"},{"key":"1590_CR8","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1016\/j.febslet.2004.03.081","volume":"565","author":"JK Choi","year":"2004","unstructured":"Choi JK, Choi JY, Kim DG, Choi DW, Kim BY, Lee KH, Yeom YI, Yoo HS, Yoo OJ, Kim SS: Integrative analysis of multiple gene expression profiles applied to liver cancer study. FEBS Letters. 2004, 565: 93-100. 10.1016\/j.febslet.2004.05.087.","journal-title":"FEBS Letters"},{"issue":"3","key":"1590_CR9","doi-asserted-by":"publisher","first-page":"405","DOI":"10.1093\/bioinformatics\/18.3.405","volume":"18","author":"WP Kuo","year":"2002","unstructured":"Kuo WP, Jenssen TK, Butte AJ, Lucila OM, Kohane IS: Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics. 2002, 18 (3): 405-412. 10.1093\/bioinformatics\/18.3.405.","journal-title":"Bioinformatics"},{"issue":"3","key":"1590_CR10","first-page":"110","volume":"4","author":"KY Kim","year":"2006","unstructured":"Kim KY, Chung HC, Jeung HC, Shin JH, Kim TS, Rha SY: Significant gene selection using integrated microarray data set with batch effect. 
Genomics & Informatics. 2006, 4 (3): 110-117.","journal-title":"Genomics & Informatics"},{"key":"1590_CR11","doi-asserted-by":"publisher","first-page":"10101","DOI":"10.1073\/pnas.97.18.10101","volume":"97","author":"O Alter","year":"2000","unstructured":"Alter O, Patrick OB, David B: Singular value decomposition for genome-wide expression data processing and modelling. Proc Natl Acad Sci USA. 2000, 97: 10101-10106. 10.1073\/pnas.97.18.10101.","journal-title":"Proc Natl Acad Sci USA"},{"key":"1590_CR12","doi-asserted-by":"publisher","first-page":"1301","DOI":"10.1016\/S0140-6736(02)08270-3","volume":"359","author":"TO Nielsen","year":"2002","unstructured":"Nielsen TO, West RB, Linn SC, Alter O, Knowling MA, O'Connell J, Zhu S, Fero M, Sherlock G, Pollack JR, Patrick OB, Botstein D, Rijn M: Molecular characterisation of soft tissue tumours: a gene expression study. Lancet. 2002, 359: 1301-1307. 10.1016\/S0140-6736(02)08270-3.","journal-title":"Lancet"},{"key":"1590_CR13","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1093\/bioinformatics\/btg385","volume":"20","author":"M Benito","year":"2004","unstructured":"Benito M, Parker J, Du Q, Wu J, Xiang D, Perou CM, Marron JS: Adjustment of systematic microarray data biases. Bioinformatics. 2004, 20: 105-114. 10.1093\/bioinformatics\/btg385.","journal-title":"Bioinformatics"},{"key":"1590_CR14","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1186\/1471-2105-5-81","volume":"5","author":"H Jiang","year":"2004","unstructured":"Jiang H, Deng Y, Chen HS, Tao L, Sha Q, Chen J, Tsai CJ, Zhang S: Joint analysis of two microarray gene-expression data sets to select lung adenocarcinoma marker genes. BMC Bioinformatics. 
2004, 5: 81-10.1186\/1471-2105-5-81.","journal-title":"BMC Bioinformatics"},{"issue":"14","key":"1590_CR15","doi-asserted-by":"publisher","first-page":"1682","DOI":"10.1093\/bioinformatics\/btl183","volume":"2","author":"TS Park","year":"2006","unstructured":"Park TS, Yi SG, Shin YK, Lee SY: Combining multiple microarrays in the presence of controlling variables. Bioinformatics. 2006, 2 (14): 1682-1689. 10.1093\/bioinformatics\/btl183.","journal-title":"Bioinformatics"},{"issue":"9","key":"1590_CR16","doi-asserted-by":"publisher","first-page":"2903","DOI":"10.1093\/hmg\/ddl231","volume":"15","author":"Z Kemp","year":"2006","unstructured":"Kemp Z, Carvajal-Carmona L, Spain S, Barclay E, Gorman M, Martin L, Jaeger E, Brooks N, Bishop DT, Thomas H, Tomlinson I, Papaemmanuil E, Webb E, Sellick GS, Wood W, Evans G, Lucassen A, Maher ER, Houlston RS: Evidence for a colorectal cancer susceptibility locus on chromosome 3q21-q24 from a high-density SNP genome-wide linkage scan. Human Molecular Genetics. 2006, 15 (9): 2903-2910. 10.1093\/hmg\/ddl231.","journal-title":"Human Molecular Genetics"},{"issue":"1","key":"1590_CR17","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1093\/carcin\/bgl086","volume":"28","author":"LC Andersen","year":"2007","unstructured":"Andersen LC, Wiuf C, Kruh\u00f8ffer M, Korsgaard M, Laurberg S, \u00d8rntoft TF: Frequent occurrence of uniparental disomy in colorectal cancer. Carcinogenesis. 2007, 28 (1): 38-48. 10.1093\/carcin\/bgl086.","journal-title":"Carcinogenesis"},{"key":"1590_CR18","doi-asserted-by":"publisher","first-page":"1239","DOI":"10.1038\/sj.bjc.6603421","volume":"95","author":"A Colebatch","year":"2006","unstructured":"Colebatch A, Hitchins M, Williams M, Meagher A, Hawkins NJ, Ward RL: The role of MYH and microsatellite instability in the development of sporadic colorectal cancer. British Journal of Cancer. 2006, 95: 1239-1243. 
10.1038\/sj.bjc.6603421.","journal-title":"British Journal of Cancer"},{"key":"1590_CR19","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1158\/1078-0432.79.11.1","volume":"11","author":"TM Kim","year":"2005","unstructured":"Kim TM, Jeong HJ, Seo MY, Kim SC, Cho G, Park CH, Kim TS, Park KH, Chung HC, Rha SY: Determination of Genes Related to Gastrointestinal Tract Origin Cancer Cells Using a cDNA Microarray. Clinical Cancer Research. 2005, 11: 79-86.","journal-title":"Clinical Cancer Research"},{"issue":"4","key":"1590_CR20","doi-asserted-by":"crossref","first-page":"906","DOI":"10.2144\/02334mt04","volume":"33","author":"AJ Feldman","year":"2002","unstructured":"Feldman AJ, Costouros NG, Wang E, Qian M, Marincola FM, Alexander HR, Libutti SK: Advantages of mRNA amplification for microarray analysis. Biotechniques. 2002, 33 (4): 906-914.","journal-title":"Biotechniques"},{"key":"1590_CR21","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1186\/1471-2164-5-29","volume":"5","author":"J Schneider","year":"2004","unstructured":"Schneider J, Bune\u00df A, Huber A, Volz J, Kioschis P, Hafner M, Poustka A, S\u00fcltmann H: Systematic analysis of T7 RNA polymerase based in vitro linear RNA amplification for use in microarray experiments. BMC Genomics. 2004, 5: 29-10.1186\/1471-2164-5-29.","journal-title":"BMC Genomics"},{"key":"1590_CR22","unstructured":"R: A language and environment for statistical computing. [http:\/\/www.R-project.org]"},{"key":"1590_CR23","first-page":"1","volume-title":"Random Forests","author":"L Breiman","year":"2001","unstructured":"Breiman L: Random Forests. 
2001, Statistics Department, University of California, Berkeley, 1-33."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-8-218.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,12]],"date-time":"2023-05-12T17:54:29Z","timestamp":1683914069000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-8-218"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,6,25]]},"references-count":23,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2007,12]]}},"alternative-id":["1590"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-8-218","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,6,25]]},"assertion":[{"value":"20 December 2006","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 June 2007","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 June 2007","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"218"}}