{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,20]],"date-time":"2026-03-20T13:15:49Z","timestamp":1774012549218,"version":"3.50.1"},"reference-count":19,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2016,10,6]],"date-time":"2016-10-06T00:00:00Z","timestamp":1475712000000},"content-version":"vor","delay-in-days":2093,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Defining the precise location of structural variations (SVs) at single-nucleotide breakpoint resolution is an important problem, as it is a prerequisite for classifying SVs, evaluating their functional impact and reconstructing personal genome sequences. Given approximate breakpoint locations and a bridging assembly or split read, the problem essentially reduces to finding a correct sequence alignment. Classical algorithms for alignment and their generalizations guarantee finding the optimal (in terms of scoring) global or local alignment of two sequences. However, they cannot generally be applied to finding the biologically correct alignment of genomic sequences containing SVs because of the need to simultaneously span the SV (e.g. make a large gap) and perform precise local alignments at the flanking ends.<\/jats:p>\n               <jats:p>Results: Here, we formulate the computations involved in this problem and describe a dynamic-programming algorithm for its solution. Specifically, our algorithm, called AGE for Alignment with Gap Excision, finds the optimal solution by simultaneously aligning the 5\u2032 and 3\u2032 ends of two given sequences and introducing a \u2018large-gap jump\u2019 between the local end alignments to maximize the total alignment score. 
We also describe extensions allowing the application of AGE to tandem duplications, inversions and complex events involving two large gaps. We develop a memory-efficient implementation of AGE (allowing application to long contigs) and make it available as a downloadable software package. Finally, we applied AGE for breakpoint determination and standardization in the 1000 Genomes Project by aligning locally assembled contigs to the human genome.<\/jats:p>\n               <jats:p>Availability and Implementation: AGE is freely available at http:\/\/sv.gersteinlab.org\/age.<\/jats:p>\n               <jats:p>Contact: pi@gersteinlab.org<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq713","type":"journal-article","created":{"date-parts":[[2011,1,14]],"date-time":"2011-01-14T04:50:46Z","timestamp":1294980646000},"page":"595-603","source":"Crossref","is-referenced-by-count":83,"title":["AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision"],"prefix":"10.1093","volume":"27","author":[{"given":"Alexej","family":"Abyzov","sequence":"first","affiliation":[{"name":"Program in Computational Biology and Bioinformatics, Department of Molecular Biophysics and Biochemistry and Department of Computer Science, Yale University, New Haven, CT 06520, USA"}]},{"given":"Mark","family":"Gerstein","sequence":"additional","affiliation":[{"name":"Program in Computational Biology and Bioinformatics, Department of Molecular Biophysics and Biochemistry and Department of Computer Science, Yale University, New Haven, CT 06520, USA"}]}],
"member":"286","published-online":{"date-parts":[[2011,1,13]]},"reference":[{"key":"2023012511565389900_B1","doi-asserted-by":"crossref","DOI":"10.1101\/gr.114876.110","article-title":"CNVnator: an approach to discover, genotype and characterize typical and atypical CNVs from family and population genome sequencing","author":"Abyzov","year":"2011","journal-title":"Genome Res."},{"key":"2023012511565389900_B2","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1089\/cmb.1994.1.271","article-title":"Recent developments in linear-space alignment methods: a survey","volume":"1","author":"Chao","year":"1994","journal-title":"J. Comput. Biol."},{"key":"2023012511565389900_B3","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/nature09534","article-title":"A map of human genome variation from population-scale sequencing","volume":"467","author":"Durbin","year":"2010","journal-title":"Nature"},{"key":"2023012511565389900_B4","doi-asserted-by":"crossref","first-page":"705","DOI":"10.1016\/0022-2836(82)90398-9","article-title":"An improved algorithm for matching biological sequences","volume":"162","author":"Gotoh","year":"1982","journal-title":"J. Mol. Biol."},{"key":"2023012511565389900_B5","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1145\/360825.360861","article-title":"A linear space algorithm for computing maximal common subsequences","volume":"18","author":"Hirschberg","year":"1975","journal-title":"Commun. 
ACM"},{"key":"2023012511565389900_B6","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1093\/bioinformatics\/19.2.228","article-title":"A generalized global alignment algorithm","volume":"19","author":"Huang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012511565389900_B7","first-page":"656","article-title":"BLAT\u2013the BLAST-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023012511565389900_B8","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature06862","article-title":"Mapping and sequencing of structural variation from eight human genomes","volume":"453","author":"Kidd","year":"2008","journal-title":"Nature"},{"key":"2023012511565389900_B9","doi-asserted-by":"crossref","first-page":"R23","DOI":"10.1186\/gb-2009-10-2-r23","article-title":"PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data","volume":"10","author":"Korbel","year":"2009","journal-title":"Genome Biol."},{"key":"2023012511565389900_B10","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1038\/nbt.1600","article-title":"Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library","volume":"28","author":"Lam","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023012511565389900_B11","doi-asserted-by":"crossref","first-page":"e254","DOI":"10.1371\/journal.pbio.0050254","article-title":"The diploid genome sequence of an individual human","volume":"5","author":"Levy","year":"2007","journal-title":"PLoS Biol."},{"key":"2023012511565389900_B12","doi-asserted-by":"crossref","first-page":"S37","DOI":"10.1038\/ng2080","article-title":"Copy-number variation and association studies of human disease","volume":"39","author":"McCarroll","year":"2007","journal-title":"Nat. 
Genet."},{"key":"2023012511565389900_B13","doi-asserted-by":"crossref","first-page":"S13","DOI":"10.1038\/nmeth.1374","article-title":"Computational methods for discovering structural variation with next-generation sequencing","volume":"6","author":"Medvedev","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012511565389900_B14","doi-asserted-by":"crossref","DOI":"10.1038\/nature09708","article-title":"Mapping structural variation at fine-scale by population genome sequencing","author":"Mills","year":"2011","journal-title":"Nature"},{"key":"2023012511565389900_B15","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol."},{"key":"2023012511565389900_B16","doi-asserted-by":"crossref","first-page":"521","DOI":"10.1016\/S0092-8240(05)80075-8","article-title":"A local algorithm for DNA sequence alignment with inversions","volume":"54","author":"Schoniger","year":"1992","journal-title":"Bull. Math. Biol."},{"key":"2023012511565389900_B17","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. 
Biol."},{"key":"2023012511565389900_B18","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1101\/gr.080069.108","article-title":"MSB: a mean-shift-based approach for the analysis of structural variation in the genome","volume":"19","author":"Wang","year":"2009","journal-title":"Genome Res."},{"key":"2023012511565389900_B19","doi-asserted-by":"crossref","first-page":"1859","DOI":"10.1093\/bioinformatics\/bti310","article-title":"GMAP: a genomic mapping and alignment program for mRNA and EST sequences","volume":"21","author":"Wu","year":"2005","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/5\/595\/48865462\/bioinformatics_27_5_595.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/5\/595\/48865462\/bioinformatics_27_5_595.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T12:37:31Z","timestamp":1674650251000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/5\/595\/263681"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,1,13]]},"references-count":19,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2011,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq713","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,3,1]]},"published":{"date-parts":[[2011,1,13]]}}}