{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,9]],"date-time":"2024-09-09T01:11:45Z","timestamp":1725844305150},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>High-throughput technologies like functional screens and gene expression analysis produce extended lists of candidate genes. Gene-Set Enrichment Analysis is a commonly used and well established technique to test for the statistically significant over-representation of particular pathways. A shortcoming of this method is however, that most genes that are investigated in the experiments have very sparse functional or pathway annotation and therefore cannot be the target of such an analysis. The approach presented here aims to assign lists of genes with limited annotation to previously described functional gene collections or pathways. This works by comparing InterPro domain signatures of the candidate gene lists with domain signatures of gene sets derived from known classifications, e.g. KEGG pathways.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>In order to validate our approach, we designed a simulation study. Based on all pathways available in the KEGG database, we create test gene lists by randomly selecting pathway genes, removing these genes from the known pathways and adding variable amounts of noise in the form of genes not annotated to the pathway. We show that we can recover pathway memberships based on the simulated gene lists with high accuracy. We further demonstrate the applicability of our approach on a biological example.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>Results based on simulation and data analysis show that domain based pathway enrichment analysis is a very sensitive method to test for enrichment of pathways in sparsely annotated lists of genes. An R based software package <jats:italic>domainsignatures<\/jats:italic>, to routinely perform this analysis on the results of high-throughput screening, is available via Bioconductor.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-9-3","type":"journal-article","created":{"date-parts":[[2008,1,4]],"date-time":"2008-01-04T19:13:25Z","timestamp":1199474005000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Extending pathways based on gene lists using InterPro domain signatures"],"prefix":"10.1186","volume":"9","author":[{"given":"Florian","family":"Hahne","sequence":"first","affiliation":[]},{"given":"Alexander","family":"Mehrle","sequence":"additional","affiliation":[]},{"given":"Dorit","family":"Arlt","sequence":"additional","affiliation":[]},{"given":"Annemarie","family":"Poustka","sequence":"additional","affiliation":[]},{"given":"Stefan","family":"Wiemann","sequence":"additional","affiliation":[]},{"given":"Tim","family":"Beissbarth","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2008,1,4]]},"reference":[{"key":"1988_CR1","doi-asserted-by":"publisher","first-page":"340","DOI":"10.1016\/S0076-6879(06)11018-6","volume":"411","author":"T Beissbarth","year":"2006","unstructured":"Beissbarth T: Interpreting experimental results using gene ontologies. Methods Enzymol 2006, 411: 340\u2013352. 10.1016\/S0076-6879(06)11018-6","journal-title":"Methods Enzymol"},{"issue":"9","key":"1988_CR2","doi-asserted-by":"publisher","first-page":"1464","DOI":"10.1093\/bioinformatics\/bth088","volume":"20","author":"T Beissbarth","year":"2004","unstructured":"Beissbarth T, Speed TP: GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 2004, 20(9):1464\u20131465. 10.1093\/bioinformatics\/bth088","journal-title":"Bioinformatics"},{"issue":"13","key":"1988_CR3","doi-asserted-by":"publisher","first-page":"1600","DOI":"10.1093\/bioinformatics\/btl140","volume":"22","author":"A Alexa","year":"2006","unstructured":"Alexa A, Rahnenfuehrer J, Lengauer T: Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics 2006, 22(13):1600\u20131607. 10.1093\/bioinformatics\/btl140","journal-title":"Bioinformatics"},{"issue":"20","key":"1988_CR4","doi-asserted-by":"publisher","first-page":"2500","DOI":"10.1093\/bioinformatics\/btl424","volume":"22","author":"T Manoli","year":"2006","unstructured":"Manoli T, Gretz N, Groene HJ, Kenzelmann M, Eils R, Brors B: Group testing for pathway analysis improves comparability of different microarray datasets. Bioinformatics 2006, 22(20):2500\u20132506. 10.1093\/bioinformatics\/btl424","journal-title":"Bioinformatics"},{"issue":"Web Server issu","key":"1988_CR5","doi-asserted-by":"publisher","first-page":"W91","DOI":"10.1093\/nar\/gkm260","volume":"35","author":"F Al-Shahrour","year":"2007","unstructured":"Al-Shahrour F, Minguez P, T\u00e1rraga J, Medina I, Alloza E, Montaner D, Dopazo J: FatiGO +: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments. Nucleic Acids Res 2007, 35(Web Server issue):W91-W96. 10.1093\/nar\/gkm260","journal-title":"Nucleic Acids Res"},{"key":"1988_CR6","doi-asserted-by":"publisher","first-page":"166","DOI":"10.1186\/1471-2105-8-166","volume":"8","author":"H Froehlich","year":"2007","unstructured":"Froehlich H, Speer N, Poustka A, Beissbarth T: GOSim \u2013 An R-package for computation of information theoretic GO similarities between terms and gene products. BMC Bioinformatics 2007, 8: 166. 10.1186\/1471-2105-8-166","journal-title":"BMC Bioinformatics"},{"key":"1988_CR7","doi-asserted-by":"publisher","first-page":"386","DOI":"10.1186\/1471-2105-8-386","volume":"8","author":"H Froehlich","year":"2007","unstructured":"Froehlich H, Fellmann M, Sueltmann H, Poustka A, Beissbarth T: Large scale statistical inference of signaling pathways from RNAi and microarray data. BMC Bioinformatics 2007, 8: 386. 10.1186\/1471-2105-8-386","journal-title":"BMC Bioinformatics"},{"issue":"9","key":"1988_CR8","doi-asserted-by":"publisher","first-page":"1217","DOI":"10.1089\/cmb.2007.0085","volume":"14","author":"A Tresch","year":"2007","unstructured":"Tresch A, Beissbarth T, Sueltmann H, Kuner R, Poustka A, Buness A: Discrimination of direct and indirect interactions in a network of regulatory effects. J Comput Biol 2007, 14(9):1217\u20131228. 10.1089\/cmb.2007.0085","journal-title":"J Comput Biol"},{"key":"1988_CR9","doi-asserted-by":"crossref","unstructured":"Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M: The KEGG resource for deciphering the genome. Nucleic Acids Res 2004, (32 Database):D277-D280. 10.1093\/nar\/gkh063","DOI":"10.1093\/nar\/gkh063"},{"issue":"7","key":"1988_CR10","doi-asserted-by":"publisher","first-page":"1985","DOI":"10.1002\/pmic.200300721","volume":"4","author":"PJ Kersey","year":"2004","unstructured":"Kersey PJ, Duarte J, Williams A, Karavidopoulou Y, Birney E, Apweiler R: The International Protein Index: an integrated database for proteomics experiments. Proteomics 2004, 4(7):1985\u20131988. 10.1002\/pmic.200300721","journal-title":"Proteomics"},{"key":"1988_CR11","doi-asserted-by":"crossref","unstructured":"Maglott D, Ostell J, Pruitt KD, Tatusova T: Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 2007, (35 Database):D26-D31. 10.1093\/nar\/gkl993","DOI":"10.1093\/nar\/gkl993"},{"key":"1988_CR12","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1016\/j.cbpa.2006.11.039","volume":"11","author":"PD Thomas","year":"2007","unstructured":"Thomas PD, Mi H, Lewis S: Ontology annotation: mapping genomic regions to biological function. Curr Opin Chem Biol 2007, 11: 4\u201311. 10.1016\/j.cbpa.2006.11.039","journal-title":"Curr Opin Chem Biol"},{"key":"1988_CR13","series-title":"Nucleic Acids Res","first-page":"D224","volume-title":"New developments in the InterPro database.","author":"NJ Mulder","year":"2007","unstructured":"Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Buillard V, Cerutti L, Copley R, Courcelle E, Das U, Daugherty L, Dibley M, Finn R, Fleischmann W, Gough J, Haft D, Hulo N, Hunter S, Kahn D, Kanapin A, Kejariwal A, Labarga A, Langendijk-Genevaux PS, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Nikolskaya AN, Orchard S, Orengo C, Petryszak R, Selengut JD, Sigrist CJA, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C: New developments in the InterPro database. Nucleic Acids Res 2007, (35 Database):D224-D228. 10.1093\/nar\/gkl841"},{"key":"1988_CR14","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1196\/annals.1310.009","volume":"1020","author":"CF Schaefer","year":"2004","unstructured":"Schaefer CF: Pathway databases. Ann N Y Acad Sci 2004, 1020: 77\u201391. 10.1196\/annals.1310.009","journal-title":"Ann N Y Acad Sci"},{"issue":"3","key":"1988_CR15","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1016\/j.chembiol.2004.11.020","volume":"12","author":"R Raman","year":"2005","unstructured":"Raman R, Sasisekharan V, Sasisekharan R: Structural insights into biological roles of protein-glycosaminoglycan interactions. Chem Biol 2005, 12(3):267\u2013277. 10.1016\/j.chembiol.2004.11.020","journal-title":"Chem Biol"},{"issue":"3","key":"1988_CR16","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1016\/j.critrevonc.2003.10.005","volume":"49","author":"Y Wegrowski","year":"2004","unstructured":"Wegrowski Y, Maquart FX: Involvement of stromal proteoglycans in tumour progression. Crit Rev Oncol Hematol 2004, 49(3):259\u2013268. 10.1016\/j.critrevonc.2003.10.005","journal-title":"Crit Rev Oncol Hematol"},{"issue":"Pt 1","key":"1988_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1042\/bj20021228","volume":"368","author":"MD Bass","year":"2002","unstructured":"Bass MD, Humphries MJ: Cytoplasmic interactions of syndecan-4 orchestrate adhesion receptor and growth factor receptor signalling. Biochem J 2002, 368(Pt 1):1\u201315. 10.1042\/BJ20021228","journal-title":"Biochem J"},{"issue":"3","key":"1988_CR18","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1016\/S1044-579X(02)00021-4","volume":"12","author":"J Timar","year":"2002","unstructured":"Timar J, Lapis K, Dudis J, Sebestyen A, Kopper L, Kovalszky I: Proteoglycans and tumor progression: Janus-faced molecules with contradictory functions in cancer. Semin Cancer Biol 2002, 12(3):173\u2013186. 10.1016\/S1044-579X(02)00021-4","journal-title":"Semin Cancer Biol"},{"key":"1988_CR19","unstructured":"Bioconductor[http:\/\/www.bioconductor.org]"},{"key":"1988_CR20","unstructured":"Kyoto Encyclopedia of Genes and Genomes[http:\/\/www.genome.jp\/kegg\/]"},{"key":"1988_CR21","unstructured":"International Protein Index[http:\/\/www.ebi.ac.uk\/IPI]"},{"key":"1988_CR22","unstructured":"Entrez Gene[http:\/\/www.ncbi.nlm.nih.gov\/entrez\/query.fcgi?db=gene]"},{"issue":"16","key":"1988_CR23","doi-asserted-by":"publisher","first-page":"3439","DOI":"10.1093\/bioinformatics\/bti525","volume":"21","author":"S Durinck","year":"2005","unstructured":"Durinck S, Moreau Y, Kasprzyk A, Davis S, Moor BD, Brazma A, Huber W: BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics 2005, 21(16):3439\u20133440. 10.1093\/bioinformatics\/bti525","journal-title":"Bioinformatics"},{"key":"1988_CR24","unstructured":"Ensembl Genome Database[http:\/\/www.ensembl.org]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T03:14:29Z","timestamp":1630466069000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,1,4]]},"references-count":24,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["1988"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,1,4]]},"assertion":[{"value":"16 July 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 January 2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 January 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"3"}}