{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T06:49:29Z","timestamp":1772520569358,"version":"3.50.1"},"reference-count":27,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2023,1,5]],"date-time":"2023-01-05T00:00:00Z","timestamp":1672876800000},"content-version":"vor","delay-in-days":4,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["U19MH114830"],"award-info":[{"award-number":["U19MH114830"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper\u2019s DOI, ffq efficiently fetches metadata and links to raw data in JSON format. 
ffq\u2019s modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>ffq is free and open source, and the code can be found here: https:\/\/github.com\/pachterlab\/ffq.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac667","type":"journal-article","created":{"date-parts":[[2022,10,7]],"date-time":"2022-10-07T17:11:22Z","timestamp":1665162682000},"source":"Crossref","is-referenced-by-count":24,"title":["Metadata retrieval from sequence databases with <i>ffq<\/i>"],"prefix":"10.1093","volume":"39","author":[{"given":"\u00c1ngel","family":"G\u00e1lvez-Merch\u00e1n","sequence":"first","affiliation":[{"name":"Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA"}]},{"given":"Kyung Hoi (Joseph)","family":"Min","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA 91125, USA"}]},{"given":"Lior","family":"Pachter","sequence":"additional","affiliation":[{"name":"Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA"},{"name":"Department of Computing and Mathematical Sciences, Pasadena, CA 91125, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6442-4502","authenticated-orcid":false,"given":"A Sina","family":"Booeshaghi","sequence":"additional","affiliation":[{"name":"Department of Mechanical Engineering, California Institute of Technology, Pasadena, CA 91125, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2023,1,5]]},"reference":[{"key":"2023012806281127600_btac667-B1","doi-asserted-by":"crossref","first-page":"376","DOI":"10.12688\/f1000research.23180.2","article-title":"Jupyter notebook-based tools for building structured datasets from the Sequence Read Archive","volume":"9","author":"Bernstein","year":"2020","journal-title":"F1000Res"},{"key":"2023012806281127600_btac667-B2","doi-asserted-by":"crossref","first-page":"2914","DOI":"10.1093\/bioinformatics\/btx334","article-title":"MetaSRA: normalized human sample-specific metadata for the Sequence Read Archive","volume":"33","author":"Bernstein","year":"2017","journal-title":"Bioinformatics"},{"key":"2023012806281127600_btac667-B3","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1016\/j.gpb.2021.08.001","article-title":"The genome sequence archive family: toward explosive data growth and diverse data types","volume":"19","author":"Chen","year":"2021","journal-title":"Genomics Proteomics Bioinformatics"},{"key":"2023012806281127600_btac667-B4","doi-asserted-by":"crossref","first-page":"532","DOI":"10.12688\/f1000research.18676.1","article-title":"pysradb: a Python package to query next-generation sequencing metadata and data from NCBI Sequence Read Archive","volume":"8","author":"Choudhary","year":"2019","journal-title":"F1000Res"},{"key":"2023012806281127600_btac667-B5","doi-asserted-by":"crossref","first-page":"D27","DOI":"10.1093\/nar\/gkab951","article-title":"Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2022","volume":"50","author":"CNCB-NGDC Members and Partners","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2023012806281127600_btac667-B6","doi-asserted-by":"crossref","first-page":"D794","DOI":"10.1093\/nar\/gkx1081","article-title":"The Encyclopedia of DNA Elements (ENCODE): data portal update","volume":"46","author":"Davis","year":"2018","journal-title":"Nucleic Acids 
Res"},{"key":"2023012806281127600_btac667-B7","doi-asserted-by":"crossref","first-page":"1990","DOI":"10.21105\/joss.01990","article-title":"NCBImeta: efficient and comprehensive metadata retrieval from NCBI databases","volume":"5","author":"Eaton","year":"2020","journal-title":"J. Open Source Softw"},{"key":"2023012806281127600_btac667-B8","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nature11247","article-title":"An integrated encyclopedia of DNA elements in the human genome","volume":"489","author":"ENCODE Project Consortium","year":"2012","journal-title":"Nature"},{"key":"2023012806281127600_btac667-B9","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1162\/qss_a_00022","article-title":"Crossref: the sustainable source of community-owned scholarly metadata","volume":"1","author":"Hendricks","year":"2020","journal-title":"Quant. Sci. Stud"},{"key":"2023012806281127600_btac667-B10","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1016\/j.trecan.2020.10.011","article-title":"Expanding and remixing the metadata landscape","volume":"7","author":"Hippen","year":"2021","journal-title":"Trends Cancer Res"},{"key":"2023012806281127600_btac667-B11","author":"Huang","year":"2021"},{"key":"2023012806281127600_btac667-B12","doi-asserted-by":"crossref","first-page":"D743","DOI":"10.1093\/nar\/gkaa1031","article-title":"HumanMetagenomeDB: a public repository of curated and standardized metadata for human metagenomes","volume":"49","author":"Kasmanas","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2023012806281127600_btac667-B13","first-page":"1","article-title":"Increasing metadata coverage of SRA BioSample entries using deep learning-based named entity recognition","author":"Klie","year":"2021","journal-title":"Database"},{"key":"2023012806281127600_btac667-B14","first-page":"1","article-title":"GEOMetaCuration: a web-based application for accurate manual curation of Gene Expression Omnibus 
metadata","author":"Li","year":"2018","journal-title":"Database"},{"key":"2023012806281127600_btac667-B15","author":"Luebbert","year":"2023"},{"key":"2023012806281127600_btac667-B16","doi-asserted-by":"crossref","first-page":"e1007450","DOI":"10.1371\/journal.pcbi.1007450","article-title":"Maximizing the reusability of gene expression data by predicting missing metadata","volume":"16","author":"Lung","year":"2020","journal-title":"PLoS Comput. Biol"},{"key":"2023012806281127600_btac667-B17","doi-asserted-by":"crossref","first-page":"7580","DOI":"10.1038\/s41598-019-43935-8","article-title":"GREIN: an interactive web platform for re-analyzing GEO RNA-seq data","volume":"9","author":"Mahi","year":"2019","journal-title":"Sci. Rep"},{"key":"2023012806281127600_btac667-B18","doi-asserted-by":"crossref","first-page":"1899","DOI":"10.1002\/j.1538-7305.1978.tb02135.x","article-title":"UNIX time-sharing system","volume":"57","author":"McIlroy","year":"1978","journal-title":"Bell Syst. Techn. J"},{"key":"2023012806281127600_btac667-B19","first-page":"813","author":"Melsted","year":"2021"},{"key":"2023012806281127600_btac667-B20","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1186\/s13059-021-02332-z","article-title":"Improving the completeness of public metadata accompanying omics studies","volume":"22","author":"Rajesh","year":"2021","journal-title":"Genome Biol"},{"key":"2023012806281127600_btac667-B21","author":"Razmara","year":"2019"},{"key":"2023012806281127600_btac667-B22","author":"Simon","year":"2018"},{"key":"2023012806281127600_btac667-B23","author":"Booeshaghi","year":"2022"},{"key":"2023012806281127600_btac667-B24","author":"Booeshaghi","year":"2020"},{"key":"2023012806281127600_btac667-B25","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1007\/s12551-018-0490-8","article-title":"Mining data and metadata from the gene expression omnibus","volume":"11","author":"Wang","year":"2019","journal-title":"Biophys. 
Rev"},{"key":"2023012806281127600_btac667-B26","doi-asserted-by":"crossref","DOI":"10.1093\/gigascience\/giab064","article-title":"Bias-invariant RNA-sequencing metadata annotation","volume":"10","author":"Wartmann","year":"2021","journal-title":"Gigascience"},{"key":"2023012806281127600_btac667-B27","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1186\/1471-2105-14-19","article-title":"SRAdb: query and use public next-generation sequencing data from within R","volume":"14","author":"Zhu","year":"2013","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac667\/48514738\/btac667.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/1\/btac667\/48942763\/btac667.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/1\/btac667\/48942763\/btac667.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,28]],"date-time":"2023-01-28T02:50:51Z","timestamp":1674874251000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btac667\/6971839"}},"subtitle":[],"editor":[{"given":"Peter","family":"Robinson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,1,1]]},"references-count":27,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac667","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.05.18.492548","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"
electronic"}],"subject":[],"published-other":{"date-parts":[[2023,1,1]]},"published":{"date-parts":[[2023,1,1]]},"article-number":"btac667"}}