{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,1]],"date-time":"2025-11-01T05:02:17Z","timestamp":1761973337168,"version":"build-2065373602"},"reference-count":10,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2013,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Characterising genetic diversity through the analysis of massively parallel sequencing (MPS) data offers enormous potential to significantly improve our understanding of the genetic basis for observed phenotypes, including predisposition to and progression of complex human disease. Great challenges remain in resolving genetic variants that are genuine from the millions of artefactual signals.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>FAVR is a suite of new methods designed to work with commonly used MPS analysis pipelines to assist in the resolution of some of the issues related to the analysis of the vast amount of resulting data, with a focus on relatively rare genetic variants. To the best of our knowledge, no equivalent method has previously been described. The most important and novel aspect of FAVR is the use of signatures in comparator sequence alignment files during variant filtering, and annotation of variants potentially shared between individuals. The FAVR methods use these signatures to facilitate filtering of (i) platform and\/or mapping-specific artefacts, (ii) common genetic variants, and, where relevant, (iii) artefacts derived from imbalanced paired-end sequencing, as well as annotation of genetic variants based on evidence of co-occurrence in individuals. 
We applied conventional variant calling to whole-exome sequencing datasets, produced using both SOLiD and TruSeq chemistries, with or without downstream processing by FAVR methods. We demonstrate a 3-fold smaller rare single nucleotide variant shortlist with no detected reduction in sensitivity. This analysis included Sanger sequencing of rare variant signals not evident in dbSNP131, assessment of known variant signal preservation, and comparison of observed and expected rare variant numbers across a range of first cousin pairs. The principles described herein were applied in our recent publication identifying <jats:italic>XRCC2<\/jats:italic> as a new breast cancer risk gene and have been made publicly available as a suite of software tools.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>FAVR is a platform-agnostic suite of methods that significantly enhances the analysis of large volumes of sequencing data for the study of rare genetic variants and their influence on phenotypes.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-14-65","type":"journal-article","created":{"date-parts":[[2013,2,25]],"date-time":"2013-02-25T13:14:21Z","timestamp":1361798061000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["FAVR (Filtering and Annotation of Variants that are Rare): methods to facilitate the analysis of rare germline genetic variants from massively parallel sequencing datasets"],"prefix":"10.1186","volume":"14","author":[{"given":"Bernard 
J","family":"Pope","sequence":"first","affiliation":[]},{"given":"T\u00fa","family":"Nguyen-Dumont","sequence":"additional","affiliation":[]},{"given":"Fabrice","family":"Odefrey","sequence":"additional","affiliation":[]},{"given":"Fleur","family":"Hammet","sequence":"additional","affiliation":[]},{"given":"Russell","family":"Bell","sequence":"additional","affiliation":[]},{"given":"Kayoko","family":"Tao","sequence":"additional","affiliation":[]},{"given":"Sean V","family":"Tavtigian","sequence":"additional","affiliation":[]},{"given":"David E","family":"Goldgar","sequence":"additional","affiliation":[]},{"given":"Andrew","family":"Lonie","sequence":"additional","affiliation":[]},{"given":"Melissa C","family":"Southey","sequence":"additional","affiliation":[]},{"given":"Daniel J","family":"Park","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2013,2,25]]},"reference":[{"key":"5736_CR1","doi-asserted-by":"publisher","first-page":"734","DOI":"10.1016\/j.ajhg.2012.02.027","volume":"90","author":"DJ Park","year":"2012","unstructured":"Park DJ, Lesueur F, Nguyen-Dumont T: Rare mutations in XRCC2 increase the risk of breast cancer. Am J Hum Genet 2012, 90: 734-739. 10.1016\/j.ajhg.2012.02.027","journal-title":"Am J Hum Genet"},{"key":"5736_CR2","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1038\/ng.2007.53","volume":"40","author":"MR Stratton","year":"2008","unstructured":"Stratton MR, Rahman N: The emerging landscape of breast cancer susceptibility. Nat Genet 2008, 40: 17-22. 10.1038\/ng.2007.53","journal-title":"Nat Genet"},{"key":"5736_CR3","doi-asserted-by":"publisher","first-page":"519","DOI":"10.1038\/ng.823","volume":"43","author":"J Yang","year":"2011","unstructured":"Yang J, Manolio TA, Pasquale LR: Genome partitioning of genetic variation for complex traits using common SNPs. Nat Genet 2011, 43: 519-525. 
10.1038\/ng.823","journal-title":"Nat Genet"},{"key":"5736_CR4","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1038\/ng.499","volume":"42","author":"SB Ng","year":"2010","unstructured":"Ng SB, Buckingham KJ, Lee C: Exome sequencing identifies the cause of a mendelian disorder. Nat Genet 2010, 42: 30-35. 10.1038\/ng.499","journal-title":"Nat Genet"},{"key":"5736_CR5","doi-asserted-by":"publisher","first-page":"415","DOI":"10.1038\/nrg2779","volume":"11","author":"ET Cirulli","year":"2010","unstructured":"Cirulli ET, Goldstein DB: Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat Rev Genet 2010, 11: 415-425. 10.1038\/nrg2779","journal-title":"Nat Rev Genet"},{"key":"5736_CR6","doi-asserted-by":"publisher","first-page":"e116","DOI":"10.1093\/nar\/gkq072","volume":"38","author":"M Mokry","year":"2010","unstructured":"Mokry M, Feitsma H, Nijman IJ: Accurate SNP and mutation detection by targeted custom microarray-based genomic enrichment of short-fragment sequencing libraries. Nucleic Acids Res 2010, 38: e116. 10.1093\/nar\/gkq072","journal-title":"Nucleic Acids Res"},{"key":"5736_CR7","doi-asserted-by":"publisher","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","volume":"25","author":"H Li","year":"2009","unstructured":"Li H, Handsaker B, Wysoker A: The sequence alignment\/Map format and SAMtools. Bioinformatics 2009, 25: 2078-2079. 10.1093\/bioinformatics\/btp352","journal-title":"Bioinformatics"},{"key":"5736_CR8","doi-asserted-by":"publisher","first-page":"2156","DOI":"10.1093\/bioinformatics\/btr330","volume":"27","author":"P Danecek","year":"2011","unstructured":"Danecek P, Auton A, Abecasis G: The variant call format and VCFtools. Bioinformatics 2011, 27: 2156-2158. 
10.1093\/bioinformatics\/btr330","journal-title":"Bioinformatics"},{"key":"5736_CR9","doi-asserted-by":"publisher","first-page":"e164","DOI":"10.1093\/nar\/gkq603","volume":"38","author":"K Wang","year":"2010","unstructured":"Wang K, Li M, Hakonarson H: ANNOVAR: Functional annotation of genetic variants from next-generation sequencing data. Nucleic Acids Res 2010, 38: e164. 10.1093\/nar\/gkq603","journal-title":"Nucleic Acids Res"},{"key":"5736_CR10","doi-asserted-by":"publisher","first-page":"491","DOI":"10.1038\/ng.806","volume":"43","author":"MA DePristo","year":"2011","unstructured":"DePristo MA, Banks E, Poplin RE: A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genetics 2011, 43: 491-498. 10.1038\/ng.806","journal-title":"Nat Genetics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-14-65.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T22:23:25Z","timestamp":1630535005000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-14-65"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,2,25]]},"references-count":10,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,12]]}},"alternative-id":["5736"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-14-65","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2013,2,25]]},"assertion":[{"value":"13 April 2012","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 December 2012","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 February 
2013","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"65"}}