{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T13:28:15Z","timestamp":1768483695661,"version":"3.49.0"},"reference-count":57,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2007,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Human genetic variations primarily result from single nucleotide polymorphisms (SNPs) that occur approximately every 1000 bases in the overall human population. The non-synonymous SNPs (nsSNPs) that lead to amino acid changes in the protein product may account for nearly half of the known genetic variations linked to inherited human diseases. One of the key problems of medical genetics today is to identify nsSNPs that underlie disease-related phenotypes in humans. As such, the development of computational tools that can identify such nsSNPs would enhance our understanding of genetic diseases and help predict the disease.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We propose a method, named Parepro (<jats:underline>P<\/jats:underline>redicting the <jats:underline>a<\/jats:underline>mino acid <jats:underline>re<\/jats:underline>placement <jats:underline>pro<\/jats:underline>bability), to identify nsSNPs having either deleterious or neutral effects on the resulting protein function. Two independent datasets, HumVar and NewHumVar, taken from the PhD-SNP server, were applied to train the model and test the robustness of Parepro. 
Using a 20-fold cross validation test on the HumVar dataset, Parepro achieved a Matthews correlation coefficient (MCC) of 50% and an overall accuracy (Q2) of 76%, both of which were higher than those obtained by methods such as PolyPhen, SIFT, and HybridMeth. Further analysis on an additional dataset (NewHumVar) using Parepro yielded similar results.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>The performance of Parepro indicates that it is a powerful tool for predicting the effect of nsSNPs on protein function and would be useful for large-scale analysis of genomic nsSNP data.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-8-450","type":"journal-article","created":{"date-parts":[[2007,11,16]],"date-time":"2007-11-16T07:13:27Z","timestamp":1195197207000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":44,"title":["Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines"],"prefix":"10.1186","volume":"8","author":[{"given":"Jian","family":"Tian","sequence":"first","affiliation":[]},{"given":"Ningfeng","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Xuexia","family":"Guo","sequence":"additional","affiliation":[]},{"given":"Jun","family":"Guo","sequence":"additional","affiliation":[]},{"given":"Juhua","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Yunliu","family":"Fan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2007,11,16]]},"reference":[{"issue":"12","key":"1822_CR1","doi-asserted-by":"crossref","first-page":"1229","DOI":"10.1101\/gr.8.12.1229","volume":"8","author":"FS Collins","year":"1998","unstructured":"Collins FS, Brooks LD, Chakravarti A: A DNA polymorphism discovery resource for research on human genetic variation. Genome Res. 
1998, 8 (12): 1229-1231.","journal-title":"Genome Res"},{"issue":"5","key":"1822_CR2","doi-asserted-by":"publisher","first-page":"1263","DOI":"10.1016\/j.jmb.2005.12.025","volume":"356","author":"P Yue","year":"2006","unstructured":"Yue P, Moult J: Identification and analysis of deleterious human SNPs. J Mol Biol. 2006, 356 (5): 1263-1274. 10.1016\/j.jmb.2005.12.025.","journal-title":"J Mol Biol"},{"issue":"17","key":"1822_CR3","doi-asserted-by":"publisher","first-page":"3894","DOI":"10.1093\/nar\/gkf493","volume":"30","author":"V Ramensky","year":"2002","unstructured":"Ramensky V, Bork P, Sunyaev S: Human non-synonymous SNPs: server and survey. Nucleic Acids Res. 2002, 30 (17): 3894-3900. 10.1093\/nar\/gkf493.","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"1822_CR4","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1002\/humu.22","volume":"17","author":"Z Wang","year":"2001","unstructured":"Wang Z, Moult J: SNPs, protein structure, and disease. Hum Mutat. 2001, 17 (4): 263-270. 10.1002\/humu.22.","journal-title":"Hum Mutat"},{"issue":"1","key":"1822_CR5","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1093\/nar\/26.1.285","volume":"26","author":"DN Cooper","year":"1998","unstructured":"Cooper DN, Ball EV, Krawczak M: The human gene mutation database. Nucleic Acids Res. 1998, 26 (1): 285-287. 10.1093\/nar\/26.1.285.","journal-title":"Nucleic Acids Res"},{"issue":"6","key":"1822_CR6","doi-asserted-by":"publisher","first-page":"577","DOI":"10.1002\/humu.10212","volume":"21","author":"PD Stenson","year":"2003","unstructured":"Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NS, Abeysinghe S, Krawczak M, Cooper DN: Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat. 2003, 21 (6): 577-581. 
10.1002\/humu.10212.","journal-title":"Hum Mutat"},{"issue":"12","key":"1822_CR7","doi-asserted-by":"publisher","first-page":"2814","DOI":"10.1093\/bioinformatics\/bti442","volume":"21","author":"R Karchin","year":"2005","unstructured":"Karchin R, Diekhans M, Kelly L, Thomas DJ, Pieper U, Eswar N, Haussler D, Sali A: LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources. Bioinformatics. 2005, 21 (12): 2814-2820. 10.1093\/bioinformatics\/bti442.","journal-title":"Bioinformatics"},{"issue":"3","key":"1822_CR8","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1101\/gr.212802","volume":"12","author":"PC Ng","year":"2002","unstructured":"Ng PC, Henikoff S: Accounting for human polymorphisms predicted to affect protein function. Genome Res. 2002, 12 (3): 436-446. 10.1101\/gr.212802.","journal-title":"Genome Res"},{"issue":"5","key":"1822_CR9","doi-asserted-by":"publisher","first-page":"1317","DOI":"10.1093\/nar\/gkj518","volume":"34","author":"E Mathe","year":"2006","unstructured":"Mathe E, Olivier M, Kato S, Ishioka C, Hainaut P, Tavtigian SV: Computational approaches for predicting the biological effect of p53 missense mutations: a comparison of three sequence analysis based methods. Nucleic Acids Res. 2006, 34 (5): 1317-1325. 10.1093\/nar\/gkj518.","journal-title":"Nucleic Acids Res"},{"issue":"22","key":"1822_CR10","doi-asserted-by":"publisher","first-page":"2729","DOI":"10.1093\/bioinformatics\/btl423","volume":"22","author":"E Capriotti","year":"2006","unstructured":"Capriotti E, Calabrese R, Casadio R: Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics. 2006, 22 (22): 2729-2734. 
10.1093\/bioinformatics\/btl423.","journal-title":"Bioinformatics"},{"issue":"14","key":"1822_CR11","doi-asserted-by":"publisher","first-page":"3176","DOI":"10.1093\/bioinformatics\/bti486","volume":"21","author":"C Ferrer-Costa","year":"2005","unstructured":"Ferrer-Costa C, Gelpi JL, Zamakola L, Parraga I, de la Cruz X, Orozco M: PMUT: a web-based tool for the annotation of pathological mutations on proteins. Bioinformatics. 2005, 21 (14): 3176-3178. 10.1093\/bioinformatics\/bti486.","journal-title":"Bioinformatics"},{"issue":"Suppl 2","key":"1822_CR12","doi-asserted-by":"publisher","first-page":"ii54","DOI":"10.1093\/bioinformatics\/bti1109","volume":"21","author":"E Capriotti","year":"2005","unstructured":"Capriotti E, Fariselli P, Calabrese R, Casadio R: Predicting protein stability changes from sequences using support vector machines. Bioinformatics. 2005, 21 (Suppl 2): ii54-58. 10.1093\/bioinformatics\/bti1109.","journal-title":"Bioinformatics"},{"issue":"6","key":"1822_CR13","doi-asserted-by":"publisher","first-page":"e83","DOI":"10.1371\/journal.pgen.0010083","volume":"1","author":"LR Brunham","year":"2005","unstructured":"Brunham LR, Singaraja RR, Pape TD, Kejariwal A, Thomas PD, Hayden MR: Accurate prediction of the functional significance of single nucleotide polymorphisms and mutations in the ABCA1 gene. PLoS Genet. 2005, 1 (6): e83-10.1371\/journal.pgen.0010083.","journal-title":"PLoS Genet"},{"issue":"11","key":"1822_CR14","doi-asserted-by":"publisher","first-page":"1974","DOI":"10.1373\/clinchem.2004.036053","volume":"50","author":"D Tchernitchko","year":"2004","unstructured":"Tchernitchko D, Goossens M, Wajcman H: In silico prediction of the deleterious effect of a mutation: proceed with caution in clinical genetics. Clin Chem. 2004, 50 (11): 1974-1978. 
10.1373\/clinchem.2004.036053.","journal-title":"Clin Chem"},{"issue":"9","key":"1822_CR15","doi-asserted-by":"publisher","first-page":"2129","DOI":"10.1101\/gr.772403","volume":"13","author":"PD Thomas","year":"2003","unstructured":"Thomas PD, Campbell MJ, Kejariwal A, Mi H, Karlak B, Daverman R, Diemer K, Muruganujan A, Narechania A: PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 2003, 13 (9): 2129-2141. 10.1101\/gr.772403.","journal-title":"Genome Res"},{"issue":"13","key":"1822_CR16","doi-asserted-by":"publisher","first-page":"3812","DOI":"10.1093\/nar\/gkg509","volume":"31","author":"PC Ng","year":"2003","unstructured":"Ng PC, Henikoff S: SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003, 31 (13): 3812-3814. 10.1093\/nar\/gkg509.","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"1822_CR17","doi-asserted-by":"publisher","first-page":"1151","DOI":"10.1073\/pnas.0237285100","volume":"100","author":"MA Fleming","year":"2003","unstructured":"Fleming MA, Potter JD, Ramirez CJ, Ostrander GK, Ostrander EA: Understanding missense mutations in the BRCA1 gene: an evolutionary approach. Proc Natl Acad Sci USA. 2003, 100 (3): 1151-1156. 10.1073\/pnas.0237285100.","journal-title":"Proc Natl Acad Sci USA"},{"issue":"7","key":"1822_CR18","doi-asserted-by":"publisher","first-page":"978","DOI":"10.1101\/gr.3804205","volume":"15","author":"EA Stone","year":"2005","unstructured":"Stone EA, Sidow A: Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Res. 2005, 15 (7): 978-986. 
10.1101\/gr.3804205.","journal-title":"Genome Res"},{"issue":"4","key":"1822_CR19","doi-asserted-by":"publisher","first-page":"891","DOI":"10.1016\/S0022-2836(02)00813-6","volume":"322","author":"CT Saunders","year":"2002","unstructured":"Saunders CT, Baker D: Evaluation of structural and evolutionary contributions to deleterious mutation prediction. J Mol Biol. 2002, 322 (4): 891-901. 10.1016\/S0022-2836(02)00813-6.","journal-title":"J Mol Biol"},{"key":"1822_CR20","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1186\/1471-2105-7-217","volume":"7","author":"RJ Dobson","year":"2006","unstructured":"Dobson RJ, Munroe PB, Caulfield MJ, Saqi MA: Predicting deleterious nsSNPs: an analysis of sequence and structural attributes. BMC Bioinformatics. 2006, 7: 217-10.1186\/1471-2105-7-217.","journal-title":"BMC Bioinformatics"},{"issue":"10","key":"1822_CR21","doi-asserted-by":"publisher","first-page":"2185","DOI":"10.1093\/bioinformatics\/bti365","volume":"21","author":"L Bao","year":"2005","unstructured":"Bao L, Cui Y: Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information. Bioinformatics. 2005, 21 (10): 2185-2190. 10.1093\/bioinformatics\/bti365.","journal-title":"Bioinformatics"},{"issue":"17","key":"1822_CR22","doi-asserted-by":"publisher","first-page":"2199","DOI":"10.1093\/bioinformatics\/btg297","volume":"19","author":"VG Krishnan","year":"2003","unstructured":"Krishnan VG, Westhead DR: A comparative study of machine-learning methods to predict the effects of single nucleotide polymorphisms on protein function. Bioinformatics. 2003, 19 (17): 2199-2209. 10.1093\/bioinformatics\/btg297.","journal-title":"Bioinformatics"},{"issue":"5","key":"1822_CR23","doi-asserted-by":"publisher","first-page":"863","DOI":"10.1101\/gr.176601","volume":"11","author":"PC Ng","year":"2001","unstructured":"Ng PC, Henikoff S: Predicting deleterious amino acid substitutions. Genome Res. 
2001, 11 (5): 863-874. 10.1101\/gr.176601.","journal-title":"Genome Res"},{"issue":"1","key":"1822_CR24","doi-asserted-by":"publisher","first-page":"447","DOI":"10.1006\/jmbi.2000.4474","volume":"307","author":"A Armon","year":"2001","unstructured":"Armon A, Graur D, Ben-Tal N: ConSurf: an algorithmic tool for the identification of functional regions in proteins by surface mapping of phylogenetic information. J Mol Biol. 2001, 307 (1): 447-463. 10.1006\/jmbi.2000.4474.","journal-title":"J Mol Biol"},{"issue":"Web Server","key":"1822_CR25","doi-asserted-by":"publisher","first-page":"W299","DOI":"10.1093\/nar\/gki370","volume":"33","author":"M Landau","year":"2005","unstructured":"Landau M, Mayrose I, Rosenberg Y, Glaser F, Martz E, Pupko T, Ben-Tal N: ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures. Nucleic Acids Res. 2005, 33 (Web Server): W299-302. 10.1093\/nar\/gki370.","journal-title":"Nucleic Acids Res"},{"issue":"Suppl 1","key":"1822_CR26","doi-asserted-by":"publisher","first-page":"S71","DOI":"10.1093\/bioinformatics\/18.suppl_1.S71","volume":"18","author":"T Pupko","year":"2002","unstructured":"Pupko T, Bell RE, Mayrose I, Glaser F, Ben-Tal N: Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics. 2002, 18 (Suppl 1): S71-77.","journal-title":"Bioinformatics"},{"issue":"10","key":"1822_CR27","doi-asserted-by":"publisher","first-page":"3193","DOI":"10.1093\/nar\/gki633","volume":"33","author":"H Chen","year":"2005","unstructured":"Chen H, Zhou HX: Prediction of solvent accessibility and sites of deleterious mutations from protein sequence. Nucleic Acids Res. 2005, 33 (10): 3193-3199. 
10.1093\/nar\/gki633.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"1822_CR28","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1002\/prot.20092","volume":"56","author":"NK Natt","year":"2004","unstructured":"Natt NK, Kaur H, Raghava GP: Prediction of transmembrane regions of beta-barrel proteins using ANN- and SVM-based methods. Proteins. 2004, 56 (1): 11-18. 10.1002\/prot.20092.","journal-title":"Proteins"},{"issue":"Web Server","key":"1822_CR29","doi-asserted-by":"publisher","first-page":"W414","DOI":"10.1093\/nar\/gkh350","volume":"32","author":"M Bhasin","year":"2004","unstructured":"Bhasin M, Raghava GP: ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST. Nucleic Acids Res. 2004, 32 (Web Server): W414-419. 10.1093\/nar\/gkh350.","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"1822_CR30","first-page":"67","volume":"2","author":"E Byvatov","year":"2003","unstructured":"Byvatov E, Schneider G: Support vector machine applications in bioinformatics. Appl Bioinformatics. 2003, 2 (2): 67-77.","journal-title":"Appl Bioinformatics"},{"issue":"4","key":"1822_CR31","doi-asserted-by":"publisher","first-page":"349","DOI":"10.1093\/bioinformatics\/17.4.349","volume":"17","author":"CH Ding","year":"2001","unstructured":"Ding CH, Dubchak I: Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics. 2001, 17 (4): 349-358. 10.1093\/bioinformatics\/17.4.349.","journal-title":"Bioinformatics"},{"issue":"9","key":"1822_CR32","doi-asserted-by":"publisher","first-page":"799","DOI":"10.1093\/bioinformatics\/16.9.799","volume":"16","author":"A Zien","year":"2000","unstructured":"Zien A, Ratsch G, Mika S, Scholkopf B, Lengauer T, Muller KR: Engineering support vector machine kernels that recognize translation initiation sites. Bioinformatics. 2000, 16 (9): 799-807. 
10.1093\/bioinformatics\/16.9.799.","journal-title":"Bioinformatics"},{"issue":"1\u20132","key":"1822_CR33","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1089\/10665270050081405","volume":"7","author":"T Jaakkola","year":"2000","unstructured":"Jaakkola T, Diekhans M, Haussler D: A discriminative framework for detecting remote protein homologies. J Comput Biol. 2000, 7 (1\u20132): 95-114. 10.1089\/10665270050081405.","journal-title":"J Comput Biol"},{"issue":"10","key":"1822_CR34","doi-asserted-by":"publisher","first-page":"906","DOI":"10.1093\/bioinformatics\/16.10.906","volume":"16","author":"TS Furey","year":"2000","unstructured":"Furey TS, Cristianini N, Duffy N, Bednarski DW, Schummer M, Haussler D: Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics. 2000, 16 (10): 906-914. 10.1093\/bioinformatics\/16.10.906.","journal-title":"Bioinformatics"},{"issue":"1","key":"1822_CR35","doi-asserted-by":"publisher","first-page":"262","DOI":"10.1073\/pnas.97.1.262","volume":"97","author":"MP Brown","year":"2000","unstructured":"Brown MP, Grundy WN, Lin D, Cristianini N, Sugnet CW, Furey TS, Ares M, Haussler D: Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci USA. 2000, 97 (1): 262-267. 10.1073\/pnas.97.1.262.","journal-title":"Proc Natl Acad Sci USA"},{"issue":"3","key":"1822_CR36","doi-asserted-by":"publisher","first-page":"278","DOI":"10.1093\/bioinformatics\/bti810","volume":"22","author":"S Idicula-Thomas","year":"2006","unstructured":"Idicula-Thomas S, Kulkarni AJ, Kulkarni BD, Jayaraman VK, Balaji PV: A support vector machine-based method for predicting the propensity of a protein to be soluble or to form inclusion body on overexpression in Escherichia coli. Bioinformatics. 2006, 22 (3): 278-284. 
10.1093\/bioinformatics\/bti810.","journal-title":"Bioinformatics"},{"issue":"5","key":"1822_CR37","doi-asserted-by":"publisher","first-page":"689","DOI":"10.1093\/bioinformatics\/18.5.689","volume":"18","author":"N Zavaljevski","year":"2002","unstructured":"Zavaljevski N, Stevens FJ, Reifman J: Support vector machines with selective kernel scaling for protein classification and identification of key amino acid positions. Bioinformatics. 2002, 18 (5): 689-696. 10.1093\/bioinformatics\/18.5.689.","journal-title":"Bioinformatics"},{"key":"1822_CR38","volume-title":"Support Vector Machines and other kernel-based learning methods","author":"C N","year":"2000","unstructured":"N C: Support Vector Machines and other kernel-based learning methods. 2000, Cambridge University Press"},{"issue":"1","key":"1822_CR39","doi-asserted-by":"publisher","first-page":"368","DOI":"10.1093\/nar\/27.1.368","volume":"27","author":"S Kawashima","year":"1999","unstructured":"Kawashima S, Ogata H, Kanehisa M: AAindex: Amino Acid Index Database. Nucleic Acids Res. 1999, 27 (1): 368-369. 10.1093\/nar\/27.1.368.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"1822_CR40","doi-asserted-by":"publisher","first-page":"374","DOI":"10.1093\/nar\/28.1.374","volume":"28","author":"S Kawashima","year":"2000","unstructured":"Kawashima S, Kanehisa M: AAindex: amino acid index database. Nucleic Acids Res. 2000, 28 (1): 374-10.1093\/nar\/28.1.374.","journal-title":"Nucleic Acids Res"},{"issue":"5","key":"1822_CR41","doi-asserted-by":"publisher","first-page":"412","DOI":"10.1093\/bioinformatics\/16.5.412","volume":"16","author":"P Baldi","year":"2000","unstructured":"Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H: Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics. 2000, 16 (5): 412-424. 
10.1093\/bioinformatics\/16.5.412.","journal-title":"Bioinformatics"},{"issue":"2","key":"1822_CR42","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","volume":"405","author":"BW Matthews","year":"1975","unstructured":"Matthews BW: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta. 1975, 405 (2): 442-451.","journal-title":"Biochim Biophys Acta"},{"issue":"4","key":"1822_CR43","doi-asserted-by":"publisher","first-page":"1125","DOI":"10.1002\/prot.20810","volume":"62","author":"J Cheng","year":"2006","unstructured":"Cheng J, Randall A, Baldi P: Prediction of protein stability changes for single-site mutations using support vector machines. Proteins. 2006, 62 (4): 1125-1132. 10.1002\/prot.20810.","journal-title":"Proteins"},{"issue":"Suppl 1","key":"1822_CR44","doi-asserted-by":"publisher","first-page":"i63","DOI":"10.1093\/bioinformatics\/bth928","volume":"20","author":"E Capriotti","year":"2004","unstructured":"Capriotti E, Fariselli P, Casadio R: A neural-network-based method for predicting protein stability changes upon single point mutations. Bioinformatics. 2004, 20 (Suppl 1): i63-68. 10.1093\/bioinformatics\/bth928.","journal-title":"Bioinformatics"},{"key":"1822_CR45","first-page":"47","volume":"1","author":"M Brown","year":"1993","unstructured":"Brown M, Hughey R, Krogh A, Mian IS, Sjolander K, Haussler D: Using Dirichlet mixture priors to derive hidden Markov models for protein families. Proc Int Conf Intell Syst Mol Biol. 1993, 1: 47-55.","journal-title":"Proc Int Conf Intell Syst Mol Biol"},{"issue":"17","key":"1822_CR46","doi-asserted-by":"publisher","first-page":"6576","DOI":"10.1073\/pnas.0305043101","volume":"101","author":"AY Lau","year":"2004","unstructured":"Lau AY, Chasman DI: Functional classification of proteins and protein variants. Proc Natl Acad Sci USA. 2004, 101 (17): 6576-6581. 
10.1073\/pnas.0305043101.","journal-title":"Proc Natl Acad Sci USA"},{"issue":"4","key":"1822_CR47","first-page":"327","volume":"12","author":"K Sjolander","year":"1996","unstructured":"Sjolander K, Karplus K, Brown M, Hughey R, Krogh A, Mian IS, Haussler D: Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology. Comput Appl Biosci. 1996, 12 (4): 327-345.","journal-title":"Comput Appl Biosci"},{"issue":"17","key":"1822_CR48","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"SF Altschul","year":"1997","unstructured":"Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093\/nar\/25.17.3389.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"1822_CR49","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1093\/nar\/gkg095","volume":"31","author":"B Boeckmann","year":"2003","unstructured":"Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O'Donovan C, Phan I, Pilbout S, Schneider M: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 2003, 31 (1): 365-370. 10.1093\/nar\/gkg095.","journal-title":"Nucleic Acids Res"},{"issue":"22","key":"1822_CR50","doi-asserted-by":"publisher","first-page":"4673","DOI":"10.1093\/nar\/22.22.4673","volume":"22","author":"JD Thompson","year":"1994","unstructured":"Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22 (22): 4673-4680. 
10.1093\/nar\/22.22.4673.","journal-title":"Nucleic Acids Res"},{"issue":"24","key":"1822_CR51","doi-asserted-by":"publisher","first-page":"4876","DOI":"10.1093\/nar\/25.24.4876","volume":"25","author":"JD Thompson","year":"1997","unstructured":"Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997, 25 (24): 4876-4882. 10.1093\/nar\/25.24.4876.","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"1822_CR52","doi-asserted-by":"publisher","first-page":"574","DOI":"10.1016\/0022-2836(94)90032-9","volume":"243","author":"S Henikoff","year":"1994","unstructured":"Henikoff S, Henikoff JG: Position-based sequence weights. J Mol Biol. 1994, 243 (4): 574-578. 10.1016\/0022-2836(94)90032-9.","journal-title":"J Mol Biol"},{"issue":"3","key":"1822_CR53","first-page":"275","volume":"8","author":"DT Jones","year":"1992","unstructured":"Jones DT, Taylor WR, Thornton JM: The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992, 8 (3): 275-282.","journal-title":"Comput Appl Biosci"},{"issue":"15","key":"1822_CR54","doi-asserted-by":"publisher","first-page":"2479","DOI":"10.1093\/bioinformatics\/bth261","volume":"20","author":"E Frank","year":"2004","unstructured":"Frank E, Hall M, Trigg L, Holmes G, Witten IH: Data mining in bioinformatics using Weka. Bioinformatics. 2004, 20 (15): 2479-2481. 10.1093\/bioinformatics\/bth261.","journal-title":"Bioinformatics"},{"issue":"1","key":"1822_CR55","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1002\/prot.340090107","volume":"9","author":"C Sander","year":"1991","unstructured":"Sander C, Schneider R: Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins. 1991, 9 (1): 56-68. 
10.1002\/prot.340090107.","journal-title":"Proteins"},{"issue":"2","key":"1822_CR56","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1002\/prot.10146","volume":"48","author":"WS Valdar","year":"2002","unstructured":"Valdar WS: Scoring residue conservation. Proteins. 2002, 48 (2): 227-241. 10.1002\/prot.10146.","journal-title":"Proteins"},{"key":"1822_CR57","unstructured":"LIBSVM. [http:\/\/www.csie.ntu.edu.tw\/~cjlin\/]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-8-450.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T01:45:07Z","timestamp":1630460707000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-8-450"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,11,16]]},"references-count":57,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2007,12]]}},"alternative-id":["1822"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-8-450","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,11,16]]},"assertion":[{"value":"26 April 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 November 2007","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 November 2007","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"450"}}