{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,26]],"date-time":"2025-10-26T22:49:37Z","timestamp":1761518977847,"version":"3.37.3"},"reference-count":85,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2010,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Semantic similarity scores for protein pairs are widely applied in functional genomic researches for finding functional clusters of proteins, predicting protein functions and protein-protein interactions, and for identifying putative disease genes. However, because some proteins, such as those related to diseases, tend to be studied more intensively, annotations are likely to be biased, which may affect applications based on semantic similarity measures. Thus, it is necessary to evaluate the effects of the bias on semantic similarity scores between proteins and then find a method to avoid them.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>First, we evaluated 14 commonly used semantic similarity scores for protein pairs and demonstrated that they significantly correlated with the numbers of annotation terms for the proteins (also known as the protein annotation length). These results suggested that current applications of the semantic similarity scores between proteins might be unreliable. Then, to reduce this annotation bias effect, we proposed normalizing the semantic similarity scores between proteins using the power transformation of the scores. 
We provide evidence that this improves performance in some applications.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>Current semantic similarity measures for protein pairs are highly dependent on protein annotation lengths, which are subject to biological research bias. This affects applications that are based on these semantic similarity scores, especially in clustering studies that rely on score magnitudes. The normalized scores proposed in this paper can reduce the effects of this bias to some extent.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-11-290","type":"journal-article","created":{"date-parts":[[2010,5,28]],"date-time":"2010-05-28T18:15:49Z","timestamp":1275070549000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":37,"title":["Revealing and avoiding bias in semantic similarity scores for protein pairs"],"prefix":"10.1186","volume":"11","author":[{"given":"Jing","family":"Wang","sequence":"first","affiliation":[]},{"given":"Xianxiao","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Jing","family":"Zhu","sequence":"additional","affiliation":[]},{"given":"Chenggui","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Zheng","family":"Guo","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2010,5,28]]},"reference":[{"issue":"1","key":"3747_CR1","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1038\/75556","volume":"25","author":"M Ashburner","year":"2000","unstructured":"Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al.: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000, 25(1):25\u201329. 
10.1038\/75556","journal-title":"Nat Genet"},{"issue":"10","key":"3747_CR2","doi-asserted-by":"publisher","first-page":"1275","DOI":"10.1093\/bioinformatics\/btg153","volume":"19","author":"PW Lord","year":"2003","unstructured":"Lord PW, Stevens RD, Brass A, Goble CA: Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation. Bioinformatics 2003, 19(10):1275\u20131283. 10.1093\/bioinformatics\/btg153","journal-title":"Bioinformatics"},{"key":"3747_CR3","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1177\/117693430600200017","volume":"2","author":"L Marino-Ramirez","year":"2006","unstructured":"Marino-Ramirez L, Bodenreider O, Kantz N, Jordan IK: Co-evolutionary Rates of Functionally Related Yeast Genes. Evol Bioinform Online 2006, 2: 295\u2013300.","journal-title":"Evol Bioinform Online"},{"key":"3747_CR4","doi-asserted-by":"publisher","first-page":"302","DOI":"10.1186\/1471-2105-7-302","volume":"7","author":"A Schlicker","year":"2006","unstructured":"Schlicker A, Domingues FS, Rahnenfuhrer J, Lengauer T: A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinformatics 2006, 7: 302. 10.1186\/1471-2105-7-302","journal-title":"BMC Bioinformatics"},{"key":"3747_CR5","first-page":"296","volume-title":"Proc 15th International Conf on Machine Learning: 1998","author":"D Lin","year":"1998","unstructured":"Lin D: An information-theoretic definition of similarity. Proc 15th International Conf on Machine Learning: 1998 1998, 296\u2013304."},{"key":"3747_CR6","first-page":"448","volume-title":"Proceedings of the 14th International Joint Conference on Artificial Intelligence: 1995","author":"P Resnik","year":"1995","unstructured":"Resnik P: Using information content to evaluate semantic similarity in a taxonomy. 
Proceedings of the 14th International Joint Conference on Artificial Intelligence: 1995 1995, 448\u2013453."},{"issue":"1","key":"3747_CR7","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1186\/1756-0381-1-11","volume":"1","author":"K Ovaska","year":"2008","unstructured":"Ovaska K, Laakso M, Hautaniemi S: Fast Gene Ontology based clustering for microarray experiments. BioData Min 2008, 1(1):11. 10.1186\/1756-0381-1-11","journal-title":"BioData Min"},{"issue":"Suppl 5","key":"3747_CR8","doi-asserted-by":"publisher","first-page":"S4","DOI":"10.1186\/1471-2105-9-S5-S4","volume":"9","author":"C Pesquita","year":"2008","unstructured":"Pesquita C, Faria D, Bastos H, Ferreira AE, Falcao AO, Couto FM: Metrics for GO based protein semantic similarity: a systematic evaluation. BMC Bioinformatics 2008, 9(Suppl 5):S4. 10.1186\/1471-2105-9-S5-S4","journal-title":"BMC Bioinformatics"},{"key":"3747_CR9","doi-asserted-by":"publisher","first-page":"327","DOI":"10.1186\/1471-2105-9-327","volume":"9","author":"M Mistry","year":"2008","unstructured":"Mistry M, Pavlidis P: Gene Ontology term overlap as a measure of gene functional similarity. BMC Bioinformatics 2008, 9: 327. 10.1186\/1471-2105-9-327","journal-title":"BMC Bioinformatics"},{"key":"3747_CR10","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1186\/1471-2105-8-235","volume":"8","author":"J Chabalier","year":"2007","unstructured":"Chabalier J, Mosser J, Burgun A: A transversal approach to predict gene product networks from ontology-based similarity. BMC Bioinformatics 2007, 8: 235. 
10.1186\/1471-2105-8-235","journal-title":"BMC Bioinformatics"},{"issue":"9","key":"3747_CR11","doi-asserted-by":"publisher","first-page":"R183","DOI":"10.1186\/gb-2007-8-9-r183","volume":"8","author":"W Huang da","year":"2007","unstructured":"Huang da W, Sherman BT, Tan Q, Collins JR, Alvord WG, Roayaei J, Stephens R, Baseler MW, Lane HC, Lempicki RA: The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol 2007, 8(9):R183. 10.1186\/gb-2007-8-9-r183","journal-title":"Genome Biol"},{"issue":"12","key":"3747_CR12","doi-asserted-by":"publisher","first-page":"R101","DOI":"10.1186\/gb-2004-5-12-r101","volume":"5","author":"D Martin","year":"2004","unstructured":"Martin D, Brun C, Remy E, Mouren P, Thieffry D, Jacq B: GOToolBox: functional analysis of gene datasets based on Gene Ontology. Genome Biol 2004, 5(12):R101. 10.1186\/gb-2004-5-12-r101","journal-title":"Genome Biol"},{"issue":"7","key":"3747_CR13","doi-asserted-by":"publisher","first-page":"e1000443","DOI":"10.1371\/journal.pcbi.1000443","volume":"5","author":"C Pesquita","year":"2009","unstructured":"Pesquita C, Faria D, Falcao AO, Lord P, Couto FM: Semantic similarity in biomedical ontologies. PLoS Comput Biol 2009, 5(7):e1000443. 10.1371\/journal.pcbi.1000443","journal-title":"PLoS Comput Biol"},{"key":"3747_CR14","doi-asserted-by":"publisher","first-page":"222","DOI":"10.1186\/1471-2164-8-222","volume":"8","author":"T Joshi","year":"2007","unstructured":"Joshi T, Xu D: Quantitative assessment of relationship between sequence similarity and function similarity. BMC Genomics 2007, 8: 222. 10.1186\/1471-2164-8-222","journal-title":"BMC Genomics"},{"key":"3747_CR15","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1186\/1471-2148-9-55","volume":"9","author":"L Yang","year":"2009","unstructured":"Yang L, Yu J: A comparative analysis of divergently-paired genes (DPGs) among Drosophila and vertebrate genomes. 
BMC Evol Biol 2009, 9: 55. 10.1186\/1471-2148-9-55","journal-title":"BMC Evol Biol"},{"issue":"1","key":"3747_CR16","doi-asserted-by":"publisher","first-page":"e1000262","DOI":"10.1371\/journal.pcbi.1000262","volume":"5","author":"AM Altenhoff","year":"2009","unstructured":"Altenhoff AM, Dessimoz C: Phylogenetic and functional assessment of orthologs inference projects and methods. PLoS Comput Biol 2009, 5(1):e1000262. 10.1371\/journal.pcbi.1000262","journal-title":"PLoS Comput Biol"},{"issue":"16","key":"3747_CR17","doi-asserted-by":"publisher","first-page":"2096","DOI":"10.1093\/bioinformatics\/btm309","volume":"23","author":"LL Elo","year":"2007","unstructured":"Elo LL, Jarvenpaa H, Oresic M, Lahesmaa R, Aittokallio T: Systematic construction of gene coexpression networks with applications to human T helper cell differentiation process. Bioinformatics 2007, 23(16):2096\u20132103. 10.1093\/bioinformatics\/btm309","journal-title":"Bioinformatics"},{"issue":"6","key":"3747_CR18","doi-asserted-by":"publisher","first-page":"1085","DOI":"10.1101\/gr.1910904","volume":"14","author":"HK Lee","year":"2004","unstructured":"Lee HK, Hsu AK, Sajdak J, Qin J, Pavlidis P: Coexpression analysis of human genes across many microarray data sets. Genome Res 2004, 14(6):1085\u20131094. 10.1101\/gr.1910904","journal-title":"Genome Res"},{"key":"3747_CR19","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1109\/CIBCB.2004.1393927","volume-title":"Proceedings of the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'04: 2004","author":"H Wang","year":"2004","unstructured":"Wang H, Azuaje F, Bodenreider O, Dopazo J: Gene expression correlation and gene ontology-based similarity: an assessment of quantitative relationships. 
Proceedings of the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'04: 2004 2004, 25\u201331."},{"issue":"Suppl 3","key":"3747_CR20","doi-asserted-by":"publisher","first-page":"S7","DOI":"10.1186\/1471-2105-8-S3-S7","volume":"8","author":"JL Chen","year":"2007","unstructured":"Chen JL, Liu Y, Sam LT, Li J, Lussier YA: Evaluation of high-throughput functional categorization of human disease genes. BMC Bioinformatics 2007, 8(Suppl 3):S7. 10.1186\/1471-2105-8-S3-S7","journal-title":"BMC Bioinformatics"},{"key":"3747_CR21","doi-asserted-by":"crossref","unstructured":"Du Z, Li L, Chen CF, Yu PS, Wang JZ: G-SESAME: web tools for GO-term-based gene similarity analysis and knowledge discovery. Nucleic Acids Res 2009, (37 Web Server):W345\u2013349. 10.1093\/nar\/gkp463","DOI":"10.1093\/nar\/gkp463"},{"issue":"10","key":"3747_CR22","doi-asserted-by":"publisher","first-page":"1274","DOI":"10.1093\/bioinformatics\/btm087","volume":"23","author":"JZ Wang","year":"2007","unstructured":"Wang JZ, Du Z, Payattakool R, Yu PS, Chen CF: A new method to measure the semantic similarity of GO terms. Bioinformatics 2007, 23(10):1274\u20131281. 10.1093\/bioinformatics\/btm087","journal-title":"Bioinformatics"},{"key":"3747_CR23","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1038\/msb.2008.42","volume":"4","author":"I Ulitsky","year":"2008","unstructured":"Ulitsky I, Shlomi T, Kupiec M, Shamir R: From E-MAPs to module maps: dissecting quantitative genetic interactions using physical interactions. Mol Syst Biol 2008, 4: 209. 10.1038\/msb.2008.42","journal-title":"Mol Syst Biol"},{"issue":"4","key":"3747_CR24","doi-asserted-by":"publisher","first-page":"e1000065","DOI":"10.1371\/journal.pcbi.1000065","volume":"4","author":"S Bandyopadhyay","year":"2008","unstructured":"Bandyopadhyay S, Kelley R, Krogan NJ, Ideker T: Functional maps of protein complexes from quantitative genetic interaction data. 
PLoS Comput Biol 2008, 4(4):e1000065. 10.1371\/journal.pcbi.1000065","journal-title":"PLoS Comput Biol"},{"key":"3747_CR25","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1186\/1471-2105-7-355","volume":"7","author":"TZ Sen","year":"2006","unstructured":"Sen TZ, Kloczkowski A, Jernigan RL: Functional clustering of yeast proteins from the protein-protein interaction network. BMC Bioinformatics 2006, 7: 355. 10.1186\/1471-2105-7-355","journal-title":"BMC Bioinformatics"},{"issue":"4","key":"3747_CR26","doi-asserted-by":"publisher","first-page":"948","DOI":"10.1002\/prot.21071","volume":"64","author":"Z Lubovac","year":"2006","unstructured":"Lubovac Z, Gamalielsson J, Olsson B: Combining functional and topological properties to identify core modules in protein interaction networks. Proteins 2006, 64(4):948\u2013959. 10.1002\/prot.21071","journal-title":"Proteins"},{"issue":"7084","key":"3747_CR27","doi-asserted-by":"publisher","first-page":"637","DOI":"10.1038\/nature04670","volume":"440","author":"NJ Krogan","year":"2006","unstructured":"Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, et al.: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 2006, 440(7084):637\u2013643. 10.1038\/nature04670","journal-title":"Nature"},{"issue":"2","key":"3747_CR28","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1038\/nbt.1522","volume":"27","author":"IW Taylor","year":"2009","unstructured":"Taylor IW, Linding R, Warde-Farley D, Liu Y, Pesquita C, Faria D, Bull S, Pawson T, Morris Q, Wrana JL: Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nat Biotechnol 2009, 27(2):199\u2013204. 
10.1038\/nbt.1522","journal-title":"Nat Biotechnol"},{"issue":"2","key":"3747_CR29","doi-asserted-by":"publisher","first-page":"e1562","DOI":"10.1371\/journal.pone.0001562","volume":"3","author":"XW Chen","year":"2008","unstructured":"Chen XW, Liu M, Ward R: Protein function assignment through mining cross-species protein-protein interactions. PLoS ONE 2008, 3(2):e1562. 10.1371\/journal.pone.0001562","journal-title":"PLoS ONE"},{"issue":"1-2","key":"3747_CR30","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1016\/j.gene.2006.12.008","volume":"391","author":"M Zhu","year":"2007","unstructured":"Zhu M, Gao L, Guo Z, Li Y, Wang D, Wang J, Wang C: Globally predicting protein functions based on co-expressed protein-protein interaction networks and ontology taxonomy similarities. Gene 2007, 391(1\u20132):113\u2013119. 10.1016\/j.gene.2006.12.008","journal-title":"Gene"},{"issue":"13","key":"3747_CR31","doi-asserted-by":"publisher","first-page":"i529","DOI":"10.1093\/bioinformatics\/btm195","volume":"23","author":"Y Tao","year":"2007","unstructured":"Tao Y, Sam L, Li J, Friedman C, Lussier YA: Information theory applied to the sparse gene ontology annotation network to predict novel gene function. Bioinformatics 2007, 23(13):i529\u2013538. 10.1093\/bioinformatics\/btm195","journal-title":"Bioinformatics"},{"issue":"6","key":"3747_CR32","doi-asserted-by":"publisher","first-page":"922","DOI":"10.1016\/j.ygeno.2004.08.005","volume":"84","author":"K Tu","year":"2004","unstructured":"Tu K, Yu H, Guo Z, Li X: Learnability-based further prediction of gene functions in Gene Ontology. Genomics 2004, 84(6):922\u2013928. 10.1016\/j.ygeno.2004.08.005","journal-title":"Genomics"},{"key":"3747_CR33","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1186\/1471-2105-9-143","volume":"9","author":"A Cakmak","year":"2008","unstructured":"Cakmak A, Ozsoyoglu G: Discovering gene annotations in biomedical text databases. BMC Bioinformatics 2008, 9: 143. 
10.1186\/1471-2105-9-143","journal-title":"BMC Bioinformatics"},{"key":"3747_CR34","doi-asserted-by":"publisher","first-page":"382","DOI":"10.1186\/1471-2105-9-382","volume":"9","author":"YR Cho","year":"2008","unstructured":"Cho YR, Shi L, Ramanathan M, Zhang A: A probabilistic framework to predict protein function from interaction data integrated with semantic knowledge. BMC Bioinformatics 2008, 9: 382. 10.1186\/1471-2105-9-382","journal-title":"BMC Bioinformatics"},{"issue":"2","key":"3747_CR35","doi-asserted-by":"publisher","first-page":"e4619","DOI":"10.1371\/journal.pone.0004619","volume":"4","author":"P Fontana","year":"2009","unstructured":"Fontana P, Cestaro A, Velasco R, Formentin E, Toppo S: Rapid annotation of anonymous sequences from genome projects using semantic similarities and a weighting scheme in gene ontology. PLoS ONE 2009, 4(2):e4619. 10.1371\/journal.pone.0004619","journal-title":"PLoS ONE"},{"issue":"5","key":"3747_CR36","doi-asserted-by":"publisher","first-page":"605","DOI":"10.1093\/bioinformatics\/btl683","volume":"23","author":"ME Futschik","year":"2007","unstructured":"Futschik ME, Chaurasia G, Herzel H: Comparison of human protein-protein interaction maps. Bioinformatics 2007, 23(5):605\u2013611. 10.1093\/bioinformatics\/btl683","journal-title":"Bioinformatics"},{"issue":"7","key":"3747_CR37","doi-asserted-by":"publisher","first-page":"2137","DOI":"10.1093\/nar\/gkl219","volume":"34","author":"X Wu","year":"2006","unstructured":"Wu X, Zhu L, Guo J, Zhang DY, Lin K: Prediction of yeast protein-protein interaction network: insights from the Gene Ontology and annotations. Nucleic Acids Res 2006, 34(7):2137\u20132150. 
10.1093\/nar\/gkl219","journal-title":"Nucleic Acids Res"},{"issue":"14","key":"3747_CR38","doi-asserted-by":"publisher","first-page":"e402","DOI":"10.1093\/bioinformatics\/btl258","volume":"22","author":"Y Ofran","year":"2006","unstructured":"Ofran Y, Yachdav G, Mozes E, Soong TT, Nair R, Rost B: Create and assess protein networks through molecular characteristics of individual proteins. Bioinformatics 2006, 22(14):e402\u2013407. 10.1093\/bioinformatics\/btl258","journal-title":"Bioinformatics"},{"issue":"5882","key":"3747_CR39","doi-asserted-by":"publisher","first-page":"1465","DOI":"10.1126\/science.1153878","volume":"320","author":"K Tarassov","year":"2008","unstructured":"Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, Malitskaya Y, Vogel J, Bussey H, Michnick SW: An in vivo map of the yeast protein interactome. Science 2008, 320(5882):1465\u20131470. 10.1126\/science.1153878","journal-title":"Science"},{"key":"3747_CR40","doi-asserted-by":"publisher","first-page":"472","DOI":"10.1186\/1471-2105-9-472","volume":"9","author":"T Xu","year":"2008","unstructured":"Xu T, Du L, Zhou Y: Evaluation of GO-based functional similarity measures using S. cerevisiae protein interaction and expression profile data. BMC Bioinformatics 2008, 9: 472. 10.1186\/1471-2105-9-472","journal-title":"BMC Bioinformatics"},{"issue":"22","key":"3747_CR41","doi-asserted-by":"publisher","first-page":"2608","DOI":"10.1093\/bioinformatics\/btn498","volume":"24","author":"TT Soong","year":"2008","unstructured":"Soong TT, Wrzeszczynski KO, Rost B: Physical protein-protein interactions predicted from microarrays. Bioinformatics 2008, 24(22):2608\u20132614. 
10.1093\/bioinformatics\/btn498","journal-title":"Bioinformatics"},{"issue":"9","key":"3747_CR42","doi-asserted-by":"publisher","first-page":"1132","DOI":"10.1093\/bioinformatics\/btm001","volume":"23","author":"KJ Gaulton","year":"2007","unstructured":"Gaulton KJ, Mohlke KL, Vision TJ: A computational system to select candidate genes for complex human traits. Bioinformatics 2007, 23(9):1132\u20131140. 10.1093\/bioinformatics\/btm001","journal-title":"Bioinformatics"},{"issue":"6","key":"3747_CR43","doi-asserted-by":"publisher","first-page":"773","DOI":"10.1093\/bioinformatics\/btk031","volume":"22","author":"EA Adie","year":"2006","unstructured":"Adie EA, Adams RR, Evans KL, Porteous DJ, Pickard BS: SUSPECTS: enabling fast and effective prioritization of positional candidates. Bioinformatics 2006, 22(6):773\u2013774. 10.1093\/bioinformatics\/btk031","journal-title":"Bioinformatics"},{"key":"3747_CR44","doi-asserted-by":"crossref","unstructured":"Chen J, Bardes EE, Aronow BJ, Jegga AG: ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res 2009, (37 Web Server):W305\u2013311. 10.1093\/nar\/gkp427","DOI":"10.1093\/nar\/gkp427"},{"issue":"Suppl 2","key":"3747_CR45","doi-asserted-by":"crossref","first-page":"S110","DOI":"10.1093\/bioinformatics\/18.suppl_2.S110","volume":"18","author":"J Freudenberg","year":"2002","unstructured":"Freudenberg J, Propping P: A similarity-based method for genome-wide prediction of disease-relevant human genes. Bioinformatics 2002, 18(Suppl 2):S110\u2013115.","journal-title":"Bioinformatics"},{"issue":"7","key":"3747_CR46","doi-asserted-by":"publisher","first-page":"509","DOI":"10.1038\/nrg2363","volume":"9","author":"SY Rhee","year":"2008","unstructured":"Rhee SY, Wood V, Dolinski K, Draghici S: Use and misuse of the gene ontology annotations. Nat Rev Genet 2008, 9(7):509\u2013515. 
10.1038\/nrg2363","journal-title":"Nat Rev Genet"},{"key":"3747_CR47","first-page":"1","volume":"00","author":"O Verver","year":"2007","unstructured":"Verver O, Ridder Jd, Reinders MJT, Wessels LFA: Prioritization of Candidate Disease Genes using Microarray Data and Functional Relations. Bioinformatics 2007, 00: 1\u201312.","journal-title":"Bioinformatics"},{"key":"3747_CR48","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1186\/1471-2105-6-55","volume":"6","author":"EA Adie","year":"2005","unstructured":"Adie EA, Adams RR, Evans KL, Porteous DJ, Pickard BS: Speeding disease gene discovery by sequence based candidate prioritization. BMC Bioinformatics 2005, 6: 55. 10.1186\/1471-2105-6-55","journal-title":"BMC Bioinformatics"},{"key":"3747_CR49","doi-asserted-by":"crossref","unstructured":"Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA: Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 2005, (33 Database):D514\u2013517.","DOI":"10.1093\/nar\/gki033"},{"issue":"21","key":"3747_CR50","doi-asserted-by":"publisher","first-page":"8685","DOI":"10.1073\/pnas.0701361104","volume":"104","author":"KI Goh","year":"2007","unstructured":"Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabasi AL: The human disease network. Proc Natl Acad Sci USA 2007, 104(21):8685\u20138690. 10.1073\/pnas.0701361104","journal-title":"Proc Natl Acad Sci USA"},{"key":"3747_CR51","volume-title":"8th Spanish Symposium on Bioinformatics and Computational Biology: 2008","author":"M Chagoyen","year":"2008","unstructured":"Chagoyen M, Carazo J, Pascual-Montano A: Pairwise similarity scores using functional annotations: review and comparison. 
8th Spanish Symposium on Bioinformatics and Computational Biology: 2008 2008."},{"key":"3747_CR52","doi-asserted-by":"publisher","first-page":"72","DOI":"10.2307\/1412159","volume":"15","author":"C Spearman","year":"1904","unstructured":"Spearman C: The Proof and Measurement of Association Between Two Things. American Journal of Psychology 1904, 15: 72\u2013101. 10.2307\/1412159","journal-title":"American Journal of Psychology"},{"issue":"3","key":"3747_CR53","doi-asserted-by":"publisher","first-page":"609","DOI":"10.1016\/S0167-9473(03)00009-4","volume":"45","author":"W Tan","year":"2004","unstructured":"Tan W, Gan F, Chang T: Using normal quantile plot to select an appropriate transformation to achieve normality. Computational Statistics & Data Analysis 2004, 45(3):609\u2013619. 10.1016\/S0167-9473(03)00009-4","journal-title":"Computational Statistics & Data Analysis"},{"key":"3747_CR54","doi-asserted-by":"publisher","first-page":"103","DOI":"10.2307\/2287775","volume":"77","author":"J Emerson","year":"1982","unstructured":"Emerson J, Stoto M: Exploratory Methods for Choosing Power Transformations. Journal of the American Statistical Association 1982, 77: 103\u2013108. 10.2307\/2287775","journal-title":"Journal of the American Statistical Association"},{"key":"3747_CR55","doi-asserted-by":"publisher","first-page":"68","DOI":"10.2307\/2280095","volume":"46","author":"J Massey","year":"1951","unstructured":"Massey J, Frank J: The Kolmogorov-Smirnov test for goodness of fit. Journal of the American Statistical Association 1951, 46: 68\u201378. 10.2307\/2280095","journal-title":"Journal of the American Statistical Association"},{"key":"3747_CR56","first-page":"1","volume":"314","author":"RJ Kuczmarski","year":"2000","unstructured":"Kuczmarski RJ, Ogden CL, Grummer-Strawn LM, Flegal KM, Guo SS, Wei R, Mei Z, Curtin LR, Roche AF, Johnson CL: CDC growth charts: United States. 
Adv Data 2000, 314: 1\u201327.","journal-title":"Adv Data"},{"key":"3747_CR57","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1111\/j.2517-6161.1989.tb01763.x","volume":"51","author":"A Kuk","year":"1989","unstructured":"Kuk A, Mak T: Median estimation in the presence of auxiliary information. Journal of the Royal Statistical Society Series B Methodological 1989, 51: 261\u2013269.","journal-title":"Journal of the Royal Statistical Society Series B Methodological"},{"issue":"4","key":"3747_CR58","first-page":"489","volume":"55","author":"JC Waterlow","year":"1977","unstructured":"Waterlow JC, Buzina R, Keller W, Lane JM, Nichaman MZ, Tanner JM: The presentation and use of height and weight data for comparing the nutritional status of groups of children under the age of 10 years. Bull World Health Organ 1977, 55(4):489\u2013498.","journal-title":"Bull World Health Organ"},{"key":"3747_CR59","first-page":"99","volume-title":"Proceedings of the MMUA: 2003","author":"M Indovina","year":"2003","unstructured":"Indovina M, Uludag U, Snelick R, Mink A, Jain A: Multimodal biometric authentication methods: a COTS approach. Proceedings of the MMUA: 2003 2003, 99\u2013106."},{"key":"3747_CR60","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1186\/1471-2105-4-24","volume":"4","author":"JM Sorace","year":"2003","unstructured":"Sorace JM, Zhan M: A data review and re-assessment of ovarian cancer serum proteomic profiling. BMC Bioinformatics 2003, 4: 24. 10.1186\/1471-2105-4-24","journal-title":"BMC Bioinformatics"},{"key":"3747_CR61","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1145\/312129.312264","volume-title":"Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining 1999","author":"J Wang","year":"1999","unstructured":"Wang J, Wang X, Lin K, Shasha D, Shapiro B, Zhang K: Evaluating A Class of Distance-Mapping Algorithms for Data Mining and Clustering. 
Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining 1999 1999, 307\u2013311."},{"key":"3747_CR62","doi-asserted-by":"publisher","first-page":"130","DOI":"10.1186\/1471-2105-7-130","volume":"7","author":"T Yamada","year":"2006","unstructured":"Yamada T, Kanehisa M, Goto S: Extraction of phylogenetic network modules from the metabolic network. BMC Bioinformatics 2006, 7: 130. 10.1186\/1471-2105-7-130","journal-title":"BMC Bioinformatics"},{"key":"3747_CR63","first-page":"5531","volume":"1","author":"W Fury","year":"2006","unstructured":"Fury W, Batliwalla F, Gregersen PK, Li W: Overlapping probabilities of top ranking gene lists, hypergeometric distribution, and stringency of gene selection criterion. Conf Proc IEEE Eng Med Biol Soc 2006, 1: 5531\u20135534.","journal-title":"Conf Proc IEEE Eng Med Biol Soc"},{"issue":"139","key":"3747_CR64","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1080\/14786443608561573","volume":"21","author":"H Gonin","year":"1936","unstructured":"Gonin H: The use of factorial moments in the treatment of the hypergeometric distribution and in tests for regression. Philosophical Magazine Series 7 1936, 21(139):215\u2013226.","journal-title":"Philosophical Magazine Series 7"},{"key":"3747_CR65","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","volume":"57","author":"Y Benjamini","year":"1995","unstructured":"Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. 
Journal of the Royal Statistical Society Series B Methodological 1995, 57: 289\u2013330.","journal-title":"Journal of the Royal Statistical Society Series B Methodological"},{"issue":"1","key":"3747_CR66","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1111\/j.1399-0004.2006.00708.x","volume":"71","author":"M Oti","year":"2007","unstructured":"Oti M, Brunner HG: The modular nature of genetic diseases. Clin Genet 2007, 71(1):1\u201311. 10.1111\/j.1399-0004.2006.00708.x","journal-title":"Clin Genet"},{"issue":"7","key":"3747_CR67","doi-asserted-by":"publisher","first-page":"545","DOI":"10.1038\/nrg1383","volume":"5","author":"HG Brunner","year":"2004","unstructured":"Brunner HG, van Driel MA: From syndrome families to functional genomics. Nat Rev Genet 2004, 5(7):545\u2013551. 10.1038\/nrg1383","journal-title":"Nat Rev Genet"},{"issue":"3","key":"3747_CR68","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1016\/j.tig.2007.12.005","volume":"24","author":"M Oti","year":"2008","unstructured":"Oti M, Huynen MA, Brunner HG: Phenome connections. Trends Genet 2008, 24(3):103\u2013106. 10.1016\/j.tig.2007.12.005","journal-title":"Trends Genet"},{"issue":"3","key":"3747_CR69","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1038\/nbt1295","volume":"25","author":"K Lage","year":"2007","unstructured":"Lage K, Karlberg EO, Storling ZM, Olason PI, Pedersen AG, Rigina O, Hinsby AM, Tumer Z, Pociot F, Tommerup N, et al.: A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol 2007, 25(3):309\u2013316. 
10.1038\/nbt1295","journal-title":"Nat Biotechnol"},{"issue":"52","key":"3747_CR70","doi-asserted-by":"publisher","first-page":"20870","DOI":"10.1073\/pnas.0810772105","volume":"105","author":"K Lage","year":"2008","unstructured":"Lage K, Hansen NT, Karlberg EO, Eklund AC, Roque FS, Donahoe PK, Szallasi Z, Jensen TS, Brunak S: A large-scale analysis of tissue-specific pathology and gene expression of human disease genes and complexes. Proc Natl Acad Sci USA 2008, 105(52):20870\u201320875. 10.1073\/pnas.0810772105","journal-title":"Proc Natl Acad Sci USA"},{"issue":"4","key":"3747_CR71","doi-asserted-by":"publisher","first-page":"364","DOI":"10.1111\/j.1399-0004.2008.01135.x","volume":"75","author":"S Girirajan","year":"2009","unstructured":"Girirajan S, Truong HT, Blanchard CL, Elsea SH: A functional network module for Smith-Magenis syndrome. Clin Genet 2009, 75(4):364\u2013374. 10.1111\/j.1399-0004.2008.01135.x","journal-title":"Clin Genet"},{"issue":"1","key":"3747_CR72","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10048-007-0116-y","volume":"9","author":"LB Moran","year":"2008","unstructured":"Moran LB, Graeber MB: Towards a pathway definition of Parkinson's disease: a complex disorder with links to cancer, diabetes and inflammation. Neurogenetics 2008, 9(1):1\u201313. 10.1007\/s10048-007-0116-y","journal-title":"Neurogenetics"},{"issue":"2","key":"3747_CR73","doi-asserted-by":"publisher","first-page":"e4346","DOI":"10.1371\/journal.pone.0004346","volume":"4","author":"Y Li","year":"2009","unstructured":"Li Y, Agarwal P: A pathway-based view of human diseases and disease relationships. PLoS One 2009, 4(2):e4346. 
10.1371\/journal.pone.0004346","journal-title":"PLoS One"},{"issue":"4","key":"3747_CR74","doi-asserted-by":"publisher","first-page":"801","DOI":"10.1016\/j.cell.2006.03.032","volume":"125","author":"J Lim","year":"2006","unstructured":"Lim J, Hao T, Shaw C, Patel AJ, Szabo G, Rual JF, Fisk CJ, Li N, Smolyar A, Hill DE, et al.: A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration. Cell 2006, 125(4):801\u2013814. 10.1016\/j.cell.2006.03.032","journal-title":"Cell"},{"issue":"6","key":"3747_CR75","doi-asserted-by":"publisher","first-page":"853","DOI":"10.1016\/j.molcel.2004.09.016","volume":"15","author":"H Goehler","year":"2004","unstructured":"Goehler H, Lalowski M, Stelzl U, Waelter S, Stroedicke M, Worm U, Droege A, Lindenberg KS, Knoblich M, Haenig C, et al.: A protein interaction network links GIT1, an enhancer of huntingtin aggregation, to Huntington's disease. Mol Cell 2004, 15(6):853\u2013865. 10.1016\/j.molcel.2004.09.016","journal-title":"Mol Cell"},{"issue":"11","key":"3747_CR76","doi-asserted-by":"publisher","first-page":"R253","DOI":"10.1186\/gb-2007-8-11-r253","volume":"8","author":"R Bergholdt","year":"2007","unstructured":"Bergholdt R, Storling ZM, Lage K, Karlberg EO, Olason PI, Aalund M, Nerup J, Brunak S, Workman CT, Pociot F: Integrative analysis for finding genes and networks involved in diabetes and other complex diseases. Genome Biol 2007, 8(11):R253. 10.1186\/gb-2007-8-11-r253","journal-title":"Genome Biol"},{"key":"3747_CR77","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1038\/msb4100125","volume":"3","author":"A Ergun","year":"2007","unstructured":"Ergun A, Lawrence CA, Kohanski MA, Brennan TA, Collins JJ: A network biology approach to prostate cancer. Mol Syst Biol 2007, 3: 82. 
10.1038\/msb4100125","journal-title":"Mol Syst Biol"},{"issue":"17","key":"3747_CR78","doi-asserted-by":"publisher","first-page":"2549","DOI":"10.1016\/j.febslet.2008.06.023","volume":"582","author":"X Jiang","year":"2008","unstructured":"Jiang X, Liu B, Jiang J, Zhao H, Fan M, Zhang J, Fan Z, Jiang T: Modularity in the genetic disease-phenotype network. FEBS Lett 2008, 582(17):2549\u20132554. 10.1016\/j.febslet.2008.06.023","journal-title":"FEBS Lett"},{"issue":"5","key":"3747_CR79","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1038\/nrd2031","volume":"5","author":"T Kaletta","year":"2006","unstructured":"Kaletta T, Hengartner MO: Finding function in novel targets: C. elegans as a model organism. Nat Rev Drug Discov 2006, 5(5):387\u2013398. 10.1038\/nrd2031","journal-title":"Nat Rev Drug Discov"},{"issue":"8","key":"3747_CR80","doi-asserted-by":"publisher","first-page":"3278","DOI":"10.1182\/blood-2004-08-3073","volume":"105","author":"DM Langenau","year":"2005","unstructured":"Langenau DM, Jette C, Berghmans S, Palomero T, Kanki JP, Kutok JL, Look AT: Suppression of apoptosis by bcl-2 overexpression in lymphoid cells of transgenic zebrafish. Blood 2005, 105(8):3278\u20133285. 10.1182\/blood-2004-08-3073","journal-title":"Blood"},{"issue":"16","key":"3747_CR81","doi-asserted-by":"publisher","first-page":"i119","DOI":"10.1093\/bioinformatics\/btn291","volume":"24","author":"S Yu","year":"2008","unstructured":"Yu S, Van Vooren S, Tranchevent LC, De Moor B, Moreau Y: Comparison of vocabularies, representations and ranking algorithms for gene prioritization by text mining. Bioinformatics 2008, 24(16):i119\u2013125. 
10.1093\/bioinformatics\/btn291","journal-title":"Bioinformatics"},{"issue":"2","key":"3747_CR82","doi-asserted-by":"publisher","first-page":"265","DOI":"10.1093\/bioinformatics\/btm558","volume":"24","author":"D Yang","year":"2008","unstructured":"Yang D, Li Y, Xiao H, Liu Q, Zhang M, Zhu J, Ma W, Yao C, Wang J, Wang D, et al.: Gaining confidence in biological interpretation of the microarray data: the functional consistence of the significant GO categories. Bioinformatics 2008, 24(2):265\u2013271. 10.1093\/bioinformatics\/btm558","journal-title":"Bioinformatics"},{"issue":"5701","key":"3747_CR83","doi-asserted-by":"publisher","first-page":"1555","DOI":"10.1126\/science.1099511","volume":"306","author":"I Lee","year":"2004","unstructured":"Lee I, Date SV, Adai AT, Marcotte EM: A probabilistic functional network of yeast genes. Science 2004, 306(5701):1555\u20131558. 10.1126\/science.1099511","journal-title":"Science"},{"issue":"6","key":"3747_CR84","doi-asserted-by":"publisher","first-page":"1011","DOI":"10.1086\/504300","volume":"78","author":"L Franke","year":"2006","unstructured":"Franke L, van Bakel H, Fokkens L, de Jong ED, Egmont-Petersen M, Wijmenga C: Reconstruction of a functional human gene network, with an application for prioritizing positional candidate genes. Am J Hum Genet 2006, 78(6):1011\u20131025. 10.1086\/504300","journal-title":"Am J Hum Genet"},{"issue":"10","key":"3747_CR85","doi-asserted-by":"publisher","first-page":"1090","DOI":"10.1038\/ng1434","volume":"36","author":"E Segal","year":"2004","unstructured":"Segal E, Friedman N, Koller D, Regev A: A module map showing conditional activity of expression modules in cancer. Nat Genet 2004, 36(10):1090\u20131098. 
10.1038\/ng1434","journal-title":"Nat Genet"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-11-290.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T10:18:26Z","timestamp":1740133106000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-11-290"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,5,28]]},"references-count":85,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,12]]}},"alternative-id":["3747"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-11-290","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2010,5,28]]},"assertion":[{"value":"9 February 2010","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 May 2010","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 May 2010","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"290"}}