{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:26:13Z","timestamp":1750242373893},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>With the recent advances and availability of various high-throughput sequencing technologies, data on many molecular aspects, such as gene regulation, chromatin dynamics, and the three-dimensional organization of DNA, are rapidly being generated in an increasing number of laboratories. The variation in biological context, and the increasingly dispersed mode of data generation, imply a need for precise, interoperable and flexible representations of genomic features through formats that are easy to parse. A host of alternative formats are currently available and in use, complicating analysis and tool development. The issue of whether and how the multitude of formats reflects varying underlying characteristics of data has to our knowledge not previously been systematically treated.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We here identify intrinsic distinctions between genomic features, and argue that the distinctions imply that a certain variation in the representation of features as genomic tracks is warranted. Four core informational properties of tracks are discussed: gaps, lengths, values and interconnections. From this we delineate fifteen generic track types. 
Based on the track type distinctions, we characterize major existing representational formats and find that the track types are not adequately supported by any single format. We also find, in contrast to the XML formats, that none of the existing tabular formats are conveniently extendable to support all track types. We thus propose two unified formats for track data, an improved XML format, BioXSD 1.1, and a new tabular format, GTrack 1.0.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>The defined track types are shown to capture relevant distinctions between genomic annotation tracks, resulting in varying representational needs and analysis possibilities. The proposed formats, GTrack 1.0 and BioXSD 1.1, cater to the identified track distinctions and emphasize preciseness, flexibility and parsing convenience.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-12-494","type":"journal-article","created":{"date-parts":[[2011,12,30]],"date-time":"2011-12-30T19:20:27Z","timestamp":1325272827000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["Identifying elemental genomic track types and representing them uniformly"],"prefix":"10.1186","volume":"12","author":[{"given":"Sveinung","family":"Gundersen","sequence":"first","affiliation":[]},{"given":"Mat\u00fa\u0161","family":"Kala\u0161","sequence":"additional","affiliation":[]},{"given":"Osman","family":"Abul","sequence":"additional","affiliation":[]},{"given":"Arnoldo","family":"Frigessi","sequence":"additional","affiliation":[]},{"given":"Eivind","family":"Hovig","sequence":"additional","affiliation":[]},{"given":"Geir 
Kjetil","family":"Sandve","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,12,30]]},"reference":[{"issue":"5950","key":"5126_CR1","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1126\/science.1181369","volume":"326","author":"E Lieberman-Aiden","year":"2009","unstructured":"Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, Sandstrom R, Bernstein B, Bender MA, Groudine M, Gnirke A, Stamatoyannopoulos J, Mirny LA, Lander ES, Dekker J: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 2009, 326(5950):289\u2013293. 10.1126\/science.1181369","journal-title":"Science"},{"key":"5126_CR2","unstructured":"Generic Feature Format version 3[http:\/\/www.sequenceontology.org\/gff3.shtml]"},{"issue":"6","key":"5126_CR3","doi-asserted-by":"publisher","first-page":"996","DOI":"10.1101\/gr.229102. Article published online before print in May 2002","volume":"12","author":"WJ Kent","year":"2002","unstructured":"Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D: The human genome browser at UCSC. Genome Res 2002, 12(6):996\u20131006.","journal-title":"Genome Res"},{"key":"5126_CR4","unstructured":"UCSC genome browser data formats[http:\/\/genome.ucsc.edu\/FAQ\/FAQformat.html]"},{"key":"5126_CR5","unstructured":"Definition of Gene Transfer Format[http:\/\/mblab.wustl.edu\/GTF22.html]"},{"issue":"8","key":"5126_CR6","doi-asserted-by":"publisher","first-page":"R88","DOI":"10.1186\/gb-2010-11-8-r88","volume":"11","author":"MG Reese","year":"2010","unstructured":"Reese MG, Moore B, Batchelor C, Salas F, Cunningham F, Marth GT, Stein L, Flicek P, Yandell M, Eilbeck K: A standard variation file format for human genome sequences. Genome Biol 2010, 11(8):R88. 
10.1186\/gb-2010-11-8-r88","journal-title":"Genome Biol"},{"key":"5126_CR7","volume-title":"PLoS Comput Biol","author":"F Liu","year":"2007","unstructured":"Liu F, Tostesen E, Sundet JK, Jenssen TK, Bock C, Jerstad GI, Thilly WG, Hovig E: The human genomic melting map. PLoS Comput Biol 2007., 3(5):"},{"key":"5126_CR8","unstructured":"Definition of Wiggle Track Format[http:\/\/genome.ucsc.edu\/goldenPath\/help\/wiggle.html]"},{"key":"5126_CR9","unstructured":"The Sequence Ontology[http:\/\/www.sequenceontology.org]"},{"issue":"12","key":"5126_CR10","doi-asserted-by":"publisher","first-page":"R121","DOI":"10.1186\/gb-2010-11-12-r121","volume":"11","author":"GK Sandve","year":"2010","unstructured":"Sandve GK, Gundersen S, Rydbeck H, Glad IK, Holden L, Holden M, Liestol K, Clancy T, Ferkingstad E, Johansen M, Nygaard V, Tostesen E, Frigessi A, Hovig E: The Genomic HyperBrowser: inferential genomics at the sequence level. Genome Biol 2010, 11(12):R121. 10.1186\/gb-2010-11-12-r121","journal-title":"Genome Biol"},{"issue":"16","key":"5126_CR11","doi-asserted-by":"publisher","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","volume":"25","author":"H Li","year":"2009","unstructured":"Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The Sequence Alignment\/Map format and SAMtools. Bioinformatics 2009, 25(16):2078\u20132079. 10.1093\/bioinformatics\/btp352","journal-title":"Bioinformatics"},{"key":"5126_CR12","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1186\/1471-2105-2-7","volume":"2","author":"RD Dowell","year":"2001","unstructured":"Dowell RD, Jokerst RM, Day A, Eddy SR, Stein L: The distributed annotation system. BMC Bioinformatics 2001, 2: 7. 
10.1186\/1471-2105-2-7","journal-title":"BMC Bioinformatics"},{"key":"5126_CR13","unstructured":"Web services provided by the Center for Biological Sequence analysis (CBS), Technical University of Denmark[http:\/\/www.cbs.dtu.dk\/ws\/]"},{"key":"5126_CR14","first-page":"D142","volume-title":"Nucleic Acids Res","author":"C UniProt","year":"2010","unstructured":"UniProt C: The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res 2010, (38 Database Issue):D142\u20138."},{"key":"5126_CR15","first-page":"D167","volume-title":"Nucleic Acids Res","author":"CM Gould","year":"2010","unstructured":"Gould CM, Diella F, Via A, Puntervoll P, Gemund C, Chabanis-Davidson S, Michael S, Sayadi A, Bryne JC, Chica C, Seiler M, Davey NE, Haslam N, Weatheritt RJ, Budd A, Hughes T, Pas J, Rychlewski L, Trave G, Aasland R, Helmer-Citterich M, Linding R, Gibson TJ: ELM: the status of the 2010 eukaryotic linear motif resource. Nucleic Acids Res 2010, (38 Database Issue):D167\u201380."},{"issue":"18","key":"5126_CR16","doi-asserted-by":"publisher","first-page":"i540","DOI":"10.1093\/bioinformatics\/btq391","volume":"26","author":"M Kalas","year":"2010","unstructured":"Kalas M, Puntervoll P, Joseph A, Bartaseviciute E, Topfer A, Venkataraman P, Pettifer S, Bryne JC, Ison J, Blanchet C, Rapacki K, Jonassen I: BioXSD: the common data-exchange format for everyday bioinformatics web services. Bioinformatics 2010, 26(18):i540\u20136. 10.1093\/bioinformatics\/btq391","journal-title":"Bioinformatics"},{"key":"5126_CR17","unstructured":"Efficient XML Interchange (EXI) Format 1.0[http:\/\/www.w3.org\/TR\/2011\/REC-exi-20110310]"},{"issue":"17","key":"5126_CR18","doi-asserted-by":"publisher","first-page":"2204","DOI":"10.1093\/bioinformatics\/btq351","volume":"26","author":"WJ Kent","year":"2010","unstructured":"Kent WJ, Zweig AS, Barber G, Hinrichs AS, Karolchik D: BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 2010, 26(17):2204\u20132207. 
10.1093\/bioinformatics\/btq351","journal-title":"Bioinformatics"},{"key":"5126_CR19","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1186\/1471-2105-9-523","volume":"9","author":"DA Nix","year":"2008","unstructured":"Nix DA, Courdy SJ, Boucher KM: Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks. BMC Bioinformatics 2008, 9: 523. 10.1186\/1471-2105-9-523","journal-title":"BMC Bioinformatics"},{"key":"5126_CR20","unstructured":"GTrack[http:\/\/www.gtrack.no]"},{"key":"5126_CR21","unstructured":"BioXSD example 1[http:\/\/bioxsd.org\/trackExample1.xml]"},{"key":"5126_CR22","unstructured":"BioXSD example 2[http:\/\/bioxsd.org\/trackExample2.xml]"},{"key":"5126_CR23","unstructured":"BioXSD example 3[http:\/\/bioxsd.org\/trackExample3.xml]"},{"key":"5126_CR24","unstructured":"BioXSD example 4[http:\/\/bioxsd.org\/trackExample4.xml]"},{"key":"5126_CR25","unstructured":"BioXSD example 5[http:\/\/bioxsd.org\/trackExample5.xml]"},{"key":"5126_CR26","unstructured":"Definition of BioXSD version 1.1[http:\/\/bioxsd.org\/BioXSD-1.1.xsd]"},{"key":"5126_CR27","unstructured":"BioXSD.org[http:\/\/bioxsd.org]"},{"key":"5126_CR28","unstructured":"The Genomic HyperBrowser[http:\/\/hyperbrowser.uio.no]"},{"key":"5126_CR29","unstructured":"Creative Commons Attribution-NoDerivs 3.0 Unported License (CC BY-ND 3.0)[http:\/\/creativecommons.org\/licenses\/by-nd\/3.0\/]"},{"issue":"8","key":"5126_CR30","doi-asserted-by":"publisher","first-page":"R86","DOI":"10.1186\/gb-2010-11-8-r86","volume":"11","author":"J Goecks","year":"2010","unstructured":"Goecks J, Nekrutenko A, Taylor J: Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 2010, 11(8):R86. 
10.1186\/gb-2010-11-8-r86","journal-title":"Genome Biol"},{"key":"5126_CR31","first-page":"21","volume-title":"Curr Protoc Mol Biol","author":"D Blankenberg","year":"2010","unstructured":"Blankenberg D, Von Kuster G, Coraor N, Ananda G, Lazarus R, Mangan M, Nekrutenko A, Taylor J: Galaxy: a web-based genome analysis tool for experimentalists. Curr Protoc Mol Biol 2010, 19:-21. Unit 19.10.1"},{"key":"5126_CR32","volume-title":"Guide to NumPy","author":"T Oliphant","year":"2006","unstructured":"Oliphant T: Guide to NumPy. Trelgol Publishing; 2006."},{"key":"5126_CR33","unstructured":"The Python Language Reference[http:\/\/docs.python.org\/release\/2.7.2\/reference\/index.html]"},{"key":"5126_CR34","unstructured":"GNU General Public License, version 3[http:\/\/www.gnu.org\/copyleft\/gpl.html]"},{"issue":"5","key":"5126_CR35","doi-asserted-by":"publisher","first-page":"718","DOI":"10.1093\/bioinformatics\/btq671","volume":"27","author":"H Li","year":"2011","unstructured":"Li H: Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics 2011, 27(5):718\u2013719. 
10.1093\/bioinformatics\/btq671","journal-title":"Bioinformatics"},{"key":"5126_CR36","unstructured":"Affymetrix CNT File Format[http:\/\/goldenhelix.com\/SNP_Variation\/Manual\/svs7\/affymetrix_cnt_file_format.html]"},{"key":"5126_CR37","unstructured":"VCF (Variant Call Format) version 4.1[http:\/\/www.1000genomes.org\/wiki\/Analysis\/Variant%20Call%20Format\/vcf-variant-call-format-version-41]"},{"key":"5126_CR38","unstructured":"The SAM Format Specification (v1.4-r985)[http:\/\/samtools.sourceforge.net\/SAM1.pdf]"},{"key":"5126_CR39","unstructured":"BioHDF[http:\/\/www.hdfgroup.org\/projects\/biohdf\/]"},{"key":"5126_CR40","unstructured":"FASTA[http:\/\/www.ncbi.nlm.nih.gov\/BLAST\/blastcgihelp.shtml]"},{"issue":"11","key":"5126_CR41","doi-asserted-by":"publisher","first-page":"1458","DOI":"10.1093\/bioinformatics\/btq164","volume":"26","author":"MM Hoffman","year":"2010","unstructured":"Hoffman MM, Buske OJ, Noble WS: The Genomedata format for storing large-scale functional genomics data. Bioinformatics 2010, 26(11):1458\u20131459. 
10.1093\/bioinformatics\/btq164","journal-title":"Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-12-494.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T18:34:33Z","timestamp":1630521273000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-12-494"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,12]]},"references-count":41,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["5126"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-12-494","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,12]]},"assertion":[{"value":"11 May 2011","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 December 2011","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 December 2011","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"494"}}