{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,20]],"date-time":"2025-06-20T23:23:46Z","timestamp":1750461826339},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2004,12,7]],"date-time":"2004-12-07T00:00:00Z","timestamp":1102377600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/2.0\/"},{"start":{"date-parts":[[2004,12,7]],"date-time":"2004-12-07T00:00:00Z","timestamp":1102377600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/2.0\/"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                        <jats:title>Background<\/jats:title>\n                        <jats:p>Expression microarrays are increasingly used to characterize environmental responses and host-parasite interactions for many different organisms. Probe selection for cDNA microarrays using expressed sequence tags (ESTs) is challenging due to high sequence redundancy and potential cross-hybridization between paralogous genes. In organisms with limited genomic information, like marine organisms, this challenge is even greater due to annotation uncertainty. No general tool is available for cDNA microarray probe selection for these organisms. 
Therefore, the goal of the design procedure described here is to select a subset of ESTs that will minimize sequence redundancy and characterize potential cross-hybridization while providing functionally representative probes.<\/jats:p>\n                     <\/jats:sec><jats:sec>\n                        <jats:title>Results<\/jats:title>\n                        <jats:p>Sequence similarity between ESTs, quantified by the E-value of pair-wise alignment, was used as a surrogate for expected hybridization between corresponding sequences. Using this value as a measure of dissimilarity, sequence redundancy reduction was performed by hierarchical cluster analyses. The choice of how many microarray probes to retain was made based on an index developed for this research: a sequence diversity index (SDI) within a sequence diversity plot (SDP). This index tracked the decreasing within-cluster sequence diversity as the number of clusters increased. For a given stage in the agglomeration procedure, the EST having the highest similarity to all the other sequences within each cluster, the centroid EST, was selected as a microarray probe. A small dataset of ESTs from Atlantic white shrimp (<jats:italic>Litopenaeus setiferus<\/jats:italic>) was used to test this algorithm so that the detailed results could be examined. The functional representative level of the selected probes was quantified using Gene Ontology (GO) annotations.<\/jats:p>\n                     <\/jats:sec><jats:sec>\n                        <jats:title>Conclusions<\/jats:title>\n                        <jats:p>For organisms with limited genomic information, combining hierarchical clustering methods to analyze ESTs can yield an optimal cDNA microarray design. If biomarker discovery is the goal of the microarray experiments, the average linkage method is more effective, while single linkage is more suitable if identification of physiological mechanisms is more of interest. 
This general design procedure is not limited to designing single-species cDNA microarrays for marine organisms, and it can equally be applied to multiple-species microarrays of any organisms with limited genomic information.<\/jats:p>\n                     <\/jats:sec>","DOI":"10.1186\/1471-2105-5-191","type":"journal-article","created":{"date-parts":[[2005,1,13]],"date-time":"2005-01-13T12:42:43Z","timestamp":1105620163000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["Optimal cDNA microarray design using expressed sequence tags for organisms with limited genomic information"],"prefix":"10.1186","volume":"5","author":[{"given":"Yian A","family":"Chen","sequence":"first","affiliation":[]},{"given":"David J","family":"Mckillen","sequence":"additional","affiliation":[]},{"given":"Shuyuan","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Matthew J","family":"Jenny","sequence":"additional","affiliation":[]},{"given":"Robert","family":"Chapman","sequence":"additional","affiliation":[]},{"given":"Paul S","family":"Gross","sequence":"additional","affiliation":[]},{"given":"Gregory W","family":"Warr","sequence":"additional","affiliation":[]},{"given":"Jonas S","family":"Almeida","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2004,12,7]]},"reference":[{"key":"307_CR1","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1038\/nrg1293","volume":"5","author":"LM Steinmetz","year":"2004","unstructured":"Steinmetz LM, Davis RW: Maximizing the potential of functional genomics.\n                           Nature Reviews Genetics 2004, 5: 190\u2013201. 
10.1038\/nrg1293","journal-title":"Nature Reviews Genetics"},{"key":"307_CR2","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1016\/S0378-1119(02)01149-6","volume":"303","author":"Y Gueguen","year":"2003","unstructured":"Gueguen Y, Cadoret JP, Flament D, Barreau-Roumiguiere C, Girardot AL, Garnier J, Hoareau A, Bachere E, Escoubas JM: Immune gene discovery by expressed sequence tags generated from hemocytes of the bacteria-challenged oyster, Crassostrea gigas.\n                           Gene 2003, 303: 139\u2013145. 10.1016\/S0378-1119(02)01149-6","journal-title":"Gene"},{"key":"307_CR3","doi-asserted-by":"publisher","first-page":"565","DOI":"10.1016\/S0145-305X(01)00018-0","volume":"25","author":"PS Gross","year":"2001","unstructured":"Gross PS, Bartlett TC, Browdy CL, Chapman RW, Warr GW: Immune gene discovery by expressed sequence tag analysis of hemocytes and hepatopancreas in the Pacific white shrimp, Litopenaeus vannamei, and the Atlantic white shrimp, L. setiferus.\n                           Dev Comp Immunol 2001, 25: 565\u2013577. 10.1016\/S0145-305X(01)00018-0","journal-title":"Dev Comp Immunol"},{"key":"307_CR4","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1007\/s10126-001-0072-8","volume":"4","author":"MJ Jenny","year":"2002","unstructured":"Jenny MJ, Ringwood AH, Lacy ER, Lewitus AJ, Kempton JW, Gross PS, Warr GW, Chapman RW: Potential indicators of stress response identified by expressed sequence tag analysis of hemocytes and embryos from the American oyster, Crassostrea virginica.\n                           Mar Biotechnol 2002, 4: 81\u201393. 
10.1007\/s10126-001-0072-8","journal-title":"Mar Biotechnol"},{"key":"307_CR5","first-page":"442","volume":"19","author":"R Lipshutz","year":"1995","unstructured":"Lipshutz R, Morris D, Chee M, Hubbell E, Kozal MJ, Shah N, Shen N, Yang R, Fodor SP: Using oligonucleotide probe arrays to access genetic diversity.\n                           Biotechniques 1995, 19: 442\u2013447.","journal-title":"Biotechniques"},{"key":"307_CR6","doi-asserted-by":"publisher","first-page":"467","DOI":"10.1126\/science.270.5235.467","volume":"270","author":"M Schena","year":"1995","unstructured":"Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray.\n                           Science 1995, 270: 467\u2013470.","journal-title":"Science"},{"key":"307_CR7","doi-asserted-by":"publisher","first-page":"3491","DOI":"10.1093\/nar\/gkg622","volume":"31","author":"HB Nielsen","year":"2003","unstructured":"Nielsen HB, Wernersson R, Knudsen S: Design of oligonucleotides for microarrays and perspectives for design of multi-transcriptome arrays.\n                           Nucl Acids Res 2003, 31: 3491\u20133496. 10.1093\/nar\/gkg622","journal-title":"Nucl Acids Res"},{"key":"307_CR8","doi-asserted-by":"publisher","first-page":"3758","DOI":"10.1093\/nar\/gkg580","volume":"31","author":"N Tolstrup","year":"2003","unstructured":"Tolstrup N, Nielsen PS, Kolberg JG, Frankel AM, Vissing H, Kauppinen S: OligoDesign: optimal design of LNA (locked nucleic acid) oligonucleotide capture probes for gene expression profiling.\n                           Nucl Acids Res 2003, 31: 3758\u20133762. 
10.1093\/nar\/gkg580","journal-title":"Nucl Acids Res"},{"key":"307_CR9","doi-asserted-by":"publisher","first-page":"3746","DOI":"10.1093\/nar\/gkg569","volume":"31","author":"SJ Emrich","year":"2003","unstructured":"Emrich SJ, Lowe M, Delcher AL: PROBEmer: a web-based software tool for selecting optimal DNA oligos.\n                           Nucl Acids Res 2003, 31: 3746\u20133750. 10.1093\/nar\/gkg569","journal-title":"Nucl Acids Res"},{"key":"307_CR10","doi-asserted-by":"publisher","first-page":"1067","DOI":"10.1093\/bioinformatics\/17.11.1067","volume":"17","author":"F Li","year":"2001","unstructured":"Li F, Stormo GD: Selection of optimal DNA oligos for gene expression arrays.\n                           Bioinformatics 2001, 17: 1067\u20131076. 10.1093\/bioinformatics\/17.11.1067","journal-title":"Bioinformatics"},{"key":"307_CR11","doi-asserted-by":"publisher","first-page":"796","DOI":"10.1093\/bioinformatics\/btg086","volume":"19","author":"X Wang","year":"2003","unstructured":"Wang X, Seed B: Selection of oligonucleotide probes for protein coding sequences.\n                           Bioinformatics 2003, 19: 796\u2013802. 10.1093\/bioinformatics\/btg086","journal-title":"Bioinformatics"},{"key":"307_CR12","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1093\/bioinformatics\/17.1.98","volume":"17","author":"G Raddatz","year":"2001","unstructured":"Raddatz G, Dehio M, Meyer TF, Dehio C: PrimeArray: genome-scale primer design for DNA-microarray construction.\n                           Bioinformatics 2001, 17: 98\u201399. 10.1093\/bioinformatics\/17.1.98","journal-title":"Bioinformatics"},{"key":"307_CR13","doi-asserted-by":"publisher","first-page":"1432","DOI":"10.1093\/bioinformatics\/18.11.1432","volume":"18","author":"D Xu","year":"2002","unstructured":"Xu D, Li G, Wu L, Zhou J, Xu Y: PRIMEGENS: robust and efficient design of gene-specific probes for microarray analysis.\n                           Bioinformatics 2002, 18: 1432\u20131437. 
10.1093\/bioinformatics\/18.11.1432","journal-title":"Bioinformatics"},{"key":"307_CR14","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1093\/bioinformatics\/18.2.321","volume":"18","author":"HB Nielsen","year":"2002","unstructured":"Nielsen HB, Knudsen S: Avoiding cross hybridization by choosing nonredundant targets on cDNA arrays.\n                           Bioinformatics 2002, 18: 321\u2013322. 10.1093\/bioinformatics\/18.2.321","journal-title":"Bioinformatics"},{"key":"307_CR15","doi-asserted-by":"publisher","first-page":"369","DOI":"10.1038\/ng0895-369","volume":"10","author":"MS Boguski","year":"1995","unstructured":"Boguski MS, Schuler GD: ESTablishing a human transcript map.\n                           Nature Genetics 1995, 10: 369\u2013371. 10.1038\/ng0895-369","journal-title":"Nature Genetics"},{"key":"307_CR16","doi-asserted-by":"publisher","first-page":"329","DOI":"10.1093\/bib\/2.4.329","volume":"2","author":"S Tomiuk","year":"2001","unstructured":"Tomiuk S, Hofmann K: Microarray probe selection strategies.\n                           Briefings in bioinformatics 2001, 2: 329\u2013340.","journal-title":"Briefings in bioinformatics"},{"key":"307_CR17","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1093\/nar\/28.1.141","volume":"28","author":"J Quackenbush","year":"2000","unstructured":"Quackenbush J, Liang F, Holt I, Pertea G, Upton J: The TIGR Gene Indices: reconstruction and representation of expressed gene sequences.\n                           Nucl Acids Res 2000, 28: 141\u2013145. 
10.1093\/nar\/28.1.141","journal-title":"Nucl Acids Res"},{"key":"307_CR18","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1093\/nar\/29.1.159","volume":"29","author":"J Quackenbush","year":"2001","unstructured":"Quackenbush J, Cho J, Lee D, Liang F, Holt I, Karamycheva S, Parvizi B, Pertea G, Sultana R, White J: The TIGR Gene Indices: analysis of gene transcript sequences in highly sampled eukaryotic species.\n                           Nucl Acids Res 2001, 29: 159\u2013164. 10.1093\/nar\/29.1.159","journal-title":"Nucl Acids Res"},{"key":"307_CR19","doi-asserted-by":"publisher","first-page":"234","DOI":"10.1093\/nar\/29.1.234","volume":"29","author":"A Christoffels","year":"2001","unstructured":"Christoffels A, Gelder A, Greyling G, Miller R, Hide T, Hide W: STACK: Sequence Tag Alignment and Consensus Knowledgebase.\n                           Nucl Acids Res 2001, 29: 234\u2013238. 10.1093\/nar\/29.1.234","journal-title":"Nucl Acids Res"},{"key":"307_CR20","volume-title":"The NCBI Handbook","author":"JU Pontius","year":"2003","unstructured":"Pontius JU, Wagner L, Schuler GD: UniGene: a unified view of the transcriptome. In The NCBI Handbook. Bethesda (MD), National Center for Biotechnology Information; 2003."},{"key":"307_CR21","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1101\/gr.9.11.1135","volume":"9","author":"J Burke","year":"1999","unstructured":"Burke J, Davison D, Hide W: d2_cluster: A Validated Method for Clustering EST and Full-Length cDNA Sequences.\n                           Genome Res 1999, 9: 1135\u20131142. 
10.1101\/gr.9.11.1135","journal-title":"Genome Res"},{"key":"307_CR22","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1093\/bioinformatics\/btg034","volume":"19","author":"G Pertea","year":"2003","unstructured":"Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, Tsai J, Quackenbush J: TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets.\n                           Bioinformatics 2003, 19: 651\u2013652. 10.1093\/bioinformatics\/btg034","journal-title":"Bioinformatics"},{"key":"307_CR23","doi-asserted-by":"publisher","first-page":"868","DOI":"10.1101\/gr.9.9.868","volume":"9","author":"X Huang","year":"1999","unstructured":"Huang X, Madan A: CAP3: A DNA Sequence Assembly Program.\n                           Genome Research 1999, 9: 868\u2013877. 10.1101\/gr.9.9.868","journal-title":"Genome Research"},{"key":"307_CR24","unstructured":"Green P: PHRAP.[http:\/\/bozeman.mbt.washington.edu\/phrap.docs\/phrap.html]"},{"key":"307_CR25","first-page":"1","volume-title":"Applied multivariate statistical analysis","author":"RA Johnson","year":"1998","unstructured":"Johnson RA, Wichern DW: Applied multivariate statistical analysis. Fourth edition. NJ, Prentice-Hall; 1998:1\u2013816.","edition":"Fourth"},{"key":"307_CR26","doi-asserted-by":"publisher","first-page":"D258","DOI":"10.1093\/nar\/gkh036","volume":"32","author":"Gene Ontology Consortium","year":"2004","unstructured":"Gene Ontology Consortium: The Gene Ontology (GO) database and informatics resource.\n                           Nucl Acids Res 2004, 32: D258\u2013261. 
10.1093\/nar\/gkh036","journal-title":"Nucl Acids Res"},{"key":"307_CR27","unstructured":"MarineGenomics: Marine Genomics website.[http:\/\/marinegenomics.org]"},{"key":"307_CR28","doi-asserted-by":"publisher","first-page":"166","DOI":"10.1038\/ng1165","volume":"34","author":"E Segal","year":"2003","unstructured":"Segal E, Shapira M, Regev A, Pe'er D, Botstein D, Koller D, Friedman N: Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data.\n                           Nat Genet 2003, 34: 166\u2013176.","journal-title":"Nat Genet"},{"key":"307_CR29","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1186\/1471-2105-5-18","volume":"5","author":"D Allocco","year":"2004","unstructured":"Allocco D, Kohane I, Butte A: Quantifying the relationship between co-expression, co-regulation and gene function.\n                           BMC Bioinformatics 2004, 5: 18. 10.1186\/1471-2105-5-18","journal-title":"BMC Bioinformatics"},{"key":"307_CR30","doi-asserted-by":"publisher","first-page":"W471","DOI":"10.1093\/nar\/gkh452","volume":"32","author":"FM Roche","year":"2004","unstructured":"Roche FM, Hokamp K, Acab M, Babiuk LA, Hancock REW, Brinkman FSL: ProbeLynx: a tool for updating the association of microarray probes to genes.\n                           Nucl Acids Res 2004, 32: W471\u2013474. 
10.1093\/nar\/gkh452","journal-title":"Nucl Acids Res"},{"key":"307_CR31","doi-asserted-by":"crossref","first-page":"620","DOI":"10.2144\/02323pf01","volume":"32","author":"NA Miller","year":"2002","unstructured":"Miller NA, Gong Q, Bryan R, Ruvolo M, Turner LA, LaBrie ST: Cross-hybridization of closely related genes on high-density macroarrays.\n                           Biotechniques 2002, 32: 620\u2013625.","journal-title":"Biotechniques"},{"key":"307_CR32","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1016\/S0378-1119(01)00516-9","volume":"272","author":"W Xu","year":"2001","unstructured":"Xu W, Bak S, Decker A, Paquette SM, Feyereisen R, Galbraith DW: Microarray-based analysis of gene expression in very large gene families: the cytochrome P450 gene superfamily of Arabidopsis thaliana.\n                           Gene 2001, 272: 61\u201374. 10.1016\/S0378-1119(01)00516-9","journal-title":"Gene"},{"key":"307_CR33","doi-asserted-by":"crossref","first-page":"1182","DOI":"10.2144\/01315dd03","volume":"31","author":"EM Evertsz","year":"2001","unstructured":"Evertsz EM, Au-Young J, Ruvolo MV, Lim AC, Reynolds MA: Hybridization cross-reactivity within homologous gene families on glass cDNA microarrays.\n                           Biotechniques 2001, 31: 1182\u20131192.","journal-title":"Biotechniques"},{"key":"307_CR34","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1007\/s00251-002-0487-z","volume":"54","author":"BJ Cuthbertson","year":"2002","unstructured":"Cuthbertson BJ, Shepard EF, Chapman RW, Gross PS: Diversity of the penaeidin antimicrobial peptides in two shrimp species.\n                           Immunogenetics 2002, 54: 442\u2013445. 10.1007\/s00251-002-0487-z","journal-title":"Immunogenetics"},{"key":"307_CR35","volume-title":"HMMER","author":"S Eddy","year":"2003","unstructured":"Eddy S: HMMER.2.3.2 edition. 
, [http:\/\/hmmer.wustl.edu\/] http:\/\/hmmer.wustl.edu\/; 2003.","edition":"2.3.2"},{"key":"307_CR36","doi-asserted-by":"publisher","first-page":"260","DOI":"10.1093\/nar\/27.1.260","volume":"27","author":"A Bateman","year":"1999","unstructured":"Bateman A, Birney E, Durbin R, Eddy SR, Finn RD, Sonnhammer EL: Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins.\n                           Nucl Acids Res 1999, 27: 260\u2013262. 10.1093\/nar\/27.1.260","journal-title":"Nucl Acids Res"},{"key":"307_CR37","doi-asserted-by":"publisher","first-page":"D138","DOI":"10.1093\/nar\/gkh121","volume":"32","author":"A Bateman","year":"2004","unstructured":"Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths-Jones S, Khanna A, Marshall M, Moxon S, Sonnhammer ELL, Studholme DJ, Yeats C, Eddy SR: The Pfam protein families database.\n                           Nucl Acids Res 2004, 32: D138\u2013141. 10.1093\/nar\/gkh121","journal-title":"Nucl Acids Res"},{"key":"307_CR38","first-page":"67","volume":"10","author":"JM Hancock","year":"1994","unstructured":"Hancock JM, Armstrong JS: SIMPLE34: an improved and enhanced implementation for VAX and Sun computers of the SIMPLE algorithm for analysis of clustered repetitive motifs in nucleotide sequences.\n                           Comput Appl Biosci 1994, 10: 67\u201370.","journal-title":"Comput Appl Biosci"},{"key":"307_CR39","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","volume":"215","author":"SF Altschul","year":"1990","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool.\n                           J Mol Biol 1990, 215: 403\u2013410. 
10.1006\/jmbi.1990.9999","journal-title":"J Mol Biol"},{"key":"307_CR40","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1038\/75556","volume":"25","author":"M Ashburner","year":"2000","unstructured":"Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson Je, Ringwald M, Rubin GM, Sherlock G: Gene Ontology: tool for the unification of biology.\n                           Nature Genetics 2000, 25: 25\u201329. 10.1038\/75556","journal-title":"Nature Genetics"},{"key":"307_CR41","unstructured":"Gene Ontology website[http:\/\/www.geneontology.org]"},{"key":"307_CR42","unstructured":"mySQL website[http:\/\/www.mysql.com]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-5-191.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/1471-2105-5-191\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-5-191.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,7]],"date-time":"2024-10-07T12:21:41Z","timestamp":1728303701000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-5-191"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,12,7]]},"references-count":42,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2004,12]]}},"alternative-id":["307"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-5-191","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[20
04,12,7]]},"assertion":[{"value":"21 August 2004","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 December 2004","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 December 2004","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"191"}}