{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:44:46Z","timestamp":1753875886524,"version":"3.41.2"},"reference-count":13,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2024,1,4]],"date-time":"2024-01-04T00:00:00Z","timestamp":1704326400000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,1,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Summary<\/jats:title>\n                  <jats:p>We created bigwig-loader, a data-loader for epigenetic profiles from BigWig files that decompresses and processes information for multiple intervals from multiple BigWig files in parallel. This is an access pattern needed to create training batches for typical machine learning models on epigenetics data. Using a new codec, the decompression can be done on a graphical processing unit (GPU) making it fast enough to create the training batches during training, mitigating the need for saving preprocessed training examples to disk.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The bigwig-loader installation instructions and source code can be accessed at https:\/\/github.com\/pfizer-opensource\/bigwig-loader<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad767","type":"journal-article","created":{"date-parts":[[2024,1,4]],"date-time":"2024-01-04T17:59:13Z","timestamp":1704391153000},"source":"Crossref","is-referenced-by-count":1,"title":["A fast machine learning dataloader for epigenetic tracks from BigWig files"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3316-5525","authenticated-orcid":false,"given":"Joren Sebastian","family":"Retel","sequence":"first","affiliation":[{"name":"Machine Learning Research, Pfizer Worldwide Research Development and Medical , Friedrichstra\u00dfe 110 , Berlin 10117, Germany"}]},{"given":"Andreas","family":"Poehlmann","sequence":"additional","affiliation":[{"name":"Machine Learning Research, Pfizer Worldwide Research Development and Medical , Friedrichstra\u00dfe 110 , Berlin 10117, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4618-0647","authenticated-orcid":false,"given":"Josh","family":"Chiou","sequence":"additional","affiliation":[{"name":"Machine Learning Research, Pfizer Worldwide Research Development and Medical , Friedrichstra\u00dfe 110 , Berlin 10117, Germany"}]},{"given":"Andreas","family":"Steffen","sequence":"additional","affiliation":[{"name":"Machine Learning Research, Pfizer Worldwide Research Development and Medical , Friedrichstra\u00dfe 110 , Berlin 10117, Germany"}]},{"given":"Djork-Arn\u00e9","family":"Clevert","sequence":"additional","affiliation":[{"name":"Machine Learning Research, Pfizer Worldwide Research Development and Medical , Friedrichstra\u00dfe 110 , Berlin 10117, Germany"}]}],"member":"286","published-online":{"date-parts":[[2024,1,4]]},"reference":[{"year":"2015","author":"Abadi","key":"2024011113063890500_btad767-B1"},{"key":"2024011113063890500_btad767-B2","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/s41588-021-00782-6","article-title":"Base-resolution models of transcription-factor binding reveal soft motif syntax","volume":"53","author":"Avsec","year":"2021","journal-title":"Nat Genet"},{"key":"2024011113063890500_btad767-B3","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1038\/s41592-019-0360-8","article-title":"Selene: a PyTorch-based deep learning library for sequence data","volume":"16","author":"Chen","year":"2019","journal-title":"Nat Methods"},{"key":"2024011113063890500_btad767-B4","doi-asserted-by":"crossref","first-page":"990","DOI":"10.1101\/gr.200535.115","article-title":"Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks","volume":"26","author":"Kelley","year":"2016","journal-title":"Genome Res"},{"year":"2018","author":"Kelley","key":"2024011113063890500_btad767-B5"},{"key":"2024011113063890500_btad767-B6","doi-asserted-by":"crossref","first-page":"2204","DOI":"10.1093\/bioinformatics\/btq351","article-title":"BigWig and BigBed: enabling browsing of large distributed datasets","volume":"26","author":"Kent","year":"2010","journal-title":"Bioinformatics"},{"key":"2024011113063890500_btad767-B7","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1186\/s13059-023-02985-y","article-title":"ExplaiNN: interpretable and transparent neural networks for genomics","volume":"24","author":"Novakovsky","year":"2023","journal-title":"Genome Biol"},{"year":"2017","author":"Okuta","key":"2024011113063890500_btad767-B8"},{"first-page":"8024","year":"2019","author":"Paszke","key":"2024011113063890500_btad767-B9"},{"year":"2023","author":"Ryan","key":"2024011113063890500_btad767-B10"},{"year":"2015","author":"Shirley","key":"2024011113063890500_btad767-B11"},{"key":"2024011113063890500_btad767-B12","doi-asserted-by":"crossref","first-page":"1088","DOI":"10.1038\/s42256-022-00570-9","article-title":"Evaluating deep learning for predicting epigenomic profiles","volume":"4","author":"Toneyan","year":"2022","journal-title":"Nat Mach Intell"},{"key":"2024011113063890500_btad767-B13","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1038\/nmeth.3547","article-title":"Predicting effects of noncoding variants with deep learning-based sequence model","volume":"12","author":"Zhou","year":"2015","journal-title":"Nat Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad767\/55024486\/btad767.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/1\/btad767\/55432088\/btad767.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/1\/btad767\/55432088\/btad767.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,11]],"date-time":"2024-01-11T13:13:14Z","timestamp":1704978794000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad767\/7510837"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,1,1]]},"references-count":13,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,1,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad767","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2024,1,1]]},"published":{"date-parts":[[2024,1,1]]},"article-number":"btad767"}}