{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T03:27:21Z","timestamp":1775964441548,"version":"3.50.1"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"D1","license":[{"start":{"date-parts":[[2021,11,10]],"date-time":"2021-11-10T00:00:00Z","timestamp":1636502400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["U24HG009446"],"award-info":[{"award-number":["U24HG009446"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The human genome contains \u223c2000 transcriptional regulatory proteins, including \u223c1600 DNA-binding transcription factors (TFs) recognizing characteristic sequence motifs to exert regulatory effects on gene expression. The binding specificities of these factors have been profiled both in vitro, using techniques such as HT-SELEX, and in vivo, using techniques including ChIP-seq. We previously developed Factorbook, a TF-centric database of annotations, motifs, and integrative analyses based on ChIP-seq data from Phase II of the ENCODE Project. Here we present an update to Factorbook which significantly expands the breadth of cell type and TF coverage. The update includes an expanded motif catalog derived from thousands of ENCODE Phase II and III ChIP-seq experiments and HT-SELEX experiments; this motif catalog is integrated with the ENCODE registry of candidate cis-regulatory elements to annotate a comprehensive collection of genome-wide candidate TF binding sites. The database also offers novel tools for applying the motif models within machine learning frameworks and using these models for integrative analysis, including annotation of variants and disease and trait heritability. Factorbook is publicly available at www.factorbook.org; we will continue to expand the resource as ENCODE Phase IV data are released.<\/jats:p>","DOI":"10.1093\/nar\/gkab1039","type":"journal-article","created":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T20:41:57Z","timestamp":1634330517000},"page":"D141-D149","source":"Crossref","is-referenced-by-count":45,"title":["Factorbook: an updated catalog of transcription factor motifs and candidate regulatory motif sites"],"prefix":"10.1093","volume":"50","author":[{"given":"Henry E","family":"Pratt","sequence":"first","affiliation":[{"name":"Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA"}]},{"given":"Gregory R","family":"Andrews","sequence":"additional","affiliation":[{"name":"Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3032-7966","authenticated-orcid":false,"given":"Nishigandha","family":"Phalke","sequence":"additional","affiliation":[{"name":"Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA"}]},{"given":"Jack D","family":"Huey","sequence":"additional","affiliation":[{"name":"Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA"}]},{"given":"Michael\u00a0J","family":"Purcaro","sequence":"additional","affiliation":[{"name":"Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA"}]},{"given":"Arjan","family":"van\u00a0der\u00a0Velde","sequence":"additional","affiliation":[{"name":"Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA"}]},{"given":"Jill E","family":"Moore","sequence":"additional","affiliation":[{"name":"Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA"}]},{"given":"Zhiping","family":"Weng","sequence":"additional","affiliation":[{"name":"Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,11,10]]},"reference":[{"key":"2022041412184043400_B1","doi-asserted-by":"crossref","first-page":"650","DOI":"10.1016\/j.cell.2018.01.029","article-title":"The human transcription factors","volume":"172","author":"Lambert","year":"2018","journal-title":"Cell"},{"key":"2022041412184043400_B2","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1016\/j.cell.2012.12.009","article-title":"DNA-binding specificities of human transcription factors","volume":"152","author":"Jolma","year":"2013","journal-title":"Cell"},{"key":"2022041412184043400_B3","doi-asserted-by":"crossref","first-page":"1497","DOI":"10.1126\/science.1141319","article-title":"Genome-wide mapping of in vivo protein-DNA interactions","volume":"316","author":"Johnson","year":"2007","journal-title":"Science"},{"key":"2022041412184043400_B4","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1038\/nmeth1068","article-title":"Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing","volume":"4","author":"Robertson","year":"2007","journal-title":"Nat. Methods"},{"key":"2022041412184043400_B5","doi-asserted-by":"crossref","first-page":"D252","DOI":"10.1093\/nar\/gkx1106","article-title":"HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis","volume":"46","author":"Kulakovskiy","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2022041412184043400_B6","first-page":"D87","article-title":"JASPAR 2020: update of the open-access database of transcription factor binding profiles","volume":"48","author":"Fornes","year":"2020","journal-title":"Nucleic Acids Res."},{"key":"2022041412184043400_B7","doi-asserted-by":"crossref","first-page":"D77","DOI":"10.1093\/nar\/gkn660","article-title":"UniPROBE: an online database of protein binding microarray data on protein-DNA interactions","volume":"37","author":"Newburger","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2022041412184043400_B8","doi-asserted-by":"crossref","first-page":"1431","DOI":"10.1016\/j.cell.2014.08.009","article-title":"Determination and inference of eukaryotic transcription factor sequence specificity","volume":"158","author":"Weirauch","year":"2014","journal-title":"Cell"},{"key":"2022041412184043400_B9","doi-asserted-by":"crossref","first-page":"1798","DOI":"10.1101\/gr.139105.112","article-title":"Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors","volume":"22","author":"Wang","year":"2012","journal-title":"Genome Res."},{"key":"2022041412184043400_B10","doi-asserted-by":"crossref","first-page":"D171","DOI":"10.1093\/nar\/gks1221","article-title":"Factorbook.org: a Wiki-based database for transcription factor-binding data generated by the ENCODE consortium","volume":"41","author":"Wang","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2022041412184043400_B11","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1016\/j.ymeth.2019.03.020","article-title":"FactorNet: a deep learning framework for predicting cell type specific transcription factor binding from nucleotide-resolution sequential data","volume":"166","author":"Quang","year":"2019","journal-title":"Methods"},{"key":"2022041412184043400_B12","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/s41588-021-00782-6","article-title":"Base-resolution models of transcription-factor binding reveal soft motif syntax","volume":"53","author":"Avsec","year":"2021","journal-title":"Nat. Genet."},{"key":"2022041412184043400_B13","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat. Biotechnol."},{"key":"2022041412184043400_B14","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1038\/s41586-020-2493-4","article-title":"Expanded encyclopaedias of DNA elements in the human and mouse genomes","volume":"583","author":"ENCODE Project Consortium","year":"2020","journal-title":"Nature"},{"key":"2022041412184043400_B15","doi-asserted-by":"crossref","first-page":"1228","DOI":"10.1038\/ng.3404","article-title":"Partitioning heritability by functional annotation using genome-wide association summary statistics","volume":"47","author":"Finucane","year":"2015","journal-title":"Nat. Genet."},{"key":"2022041412184043400_B16","doi-asserted-by":"crossref","first-page":"D260","DOI":"10.1093\/nar\/gkx1126","article-title":"JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework","volume":"46","author":"Khan","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2022041412184043400_B17","doi-asserted-by":"crossref","first-page":"D110","DOI":"10.1093\/nar\/gkv1176","article-title":"JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles","volume":"44","author":"Mathelier","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"2022041412184043400_B18","doi-asserted-by":"crossref","first-page":"D117","DOI":"10.1093\/nar\/gku1045","article-title":"UniPROBE, update 2015: new tools and content for the online database of protein-binding microarray data on protein-DNA interactions","volume":"43","author":"Hume","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"2022041412184043400_B19","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1186\/s12859-020-03952-1","article-title":"DeepGRN: prediction of transcription factor binding site across cell-types using attention-based deep neural networks","volume":"22","author":"Chen","year":"2021","journal-title":"BMC Bioinformatics"},{"key":"2022041412184043400_B20","doi-asserted-by":"crossref","first-page":"W39","DOI":"10.1093\/nar\/gkv416","article-title":"The MEME suite","volume":"43","author":"Bailey","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"2022041412184043400_B21","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1038\/nbt.4314","article-title":"Dimensionality reduction for visualizing single-cell data using UMAP","volume":"37","author":"Becht","year":"2018","journal-title":"Nat. Biotechnol."},{"key":"2022041412184043400_B22","article-title":"UMAP: uniform manifold approximation and projection for dimension reduction","author":"McInnes","year":"2018"},{"key":"2022041412184043400_B23","doi-asserted-by":"crossref","first-page":"R24","DOI":"10.1186\/gb-2007-8-2-r24","article-title":"Quantifying similarity between motifs","volume":"8","author":"Gupta","year":"2007","journal-title":"Genome Biol."},{"key":"2022041412184043400_B24","doi-asserted-by":"crossref","first-page":"eaaj2239","DOI":"10.1126\/science.aaj2239","article-title":"Impact of cytosine methylation on DNA binding specificities of human transcription factors","volume":"356","author":"Yin","year":"2017","journal-title":"Science"},{"key":"2022041412184043400_B25","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1101\/gr.097857.109","article-title":"Detection of nonneutral substitution rates on mammalian phylogenies","volume":"20","author":"Pollard","year":"2010","journal-title":"Genome Res."},{"key":"2022041412184043400_B26","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1101\/gr.136366.111","article-title":"Ubiquitous heterogeneity and asymmetry of the chromatin environment at regulatory elements","volume":"22","author":"Kundaje","year":"2012","journal-title":"Genome Res."},{"key":"2022041412184043400_B27","doi-asserted-by":"crossref","first-page":"1553","DOI":"10.1093\/bioinformatics\/btz781","article-title":"Exploiting transfer learning for the reconstruction of the human gene regulatory network","volume":"36","author":"Mignone","year":"2020","journal-title":"Bioinformatics"},{"key":"2022041412184043400_B28","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1186\/s12859-019-3220-8","article-title":"Modeling aspects of the language of life through transfer-learning protein sequences","volume":"20","author":"Heinzinger","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2022041412184043400_B29","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1101\/gr.229102","article-title":"The human genome browser at UCSC","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2022041412184043400_B30","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1007\/978-1-4939-1242-1_16","article-title":"SELEX-seq: a method for characterizing the complete repertoire of binding site preferences for transcription factor complexes","volume":"1196","author":"Riley","year":"2014","journal-title":"Methods Mol. Biol."},{"key":"2022041412184043400_B31","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1016\/j.cell.2011.10.053","article-title":"Cofactor binding evokes latent differences in DNA binding specificity between Hox proteins","volume":"147","author":"Slattery","year":"2011","journal-title":"Cell"}],"container-title":["Nucleic Acids Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/nar\/article-pdf\/50\/D1\/D141\/43378557\/gkab1039.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/nar\/article-pdf\/50\/D1\/D141\/43378557\/gkab1039.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,11]],"date-time":"2023-11-11T02:44:01Z","timestamp":1699670641000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/nar\/article\/50\/D1\/D141\/6424766"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,10]]},"references-count":31,"journal-issue":{"issue":"D1","published-online":{"date-parts":[[2021,11,10]]},"published-print":{"date-parts":[[2022,1,7]]}},"URL":"https:\/\/doi.org\/10.1093\/nar\/gkab1039","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.10.11.463518","asserted-by":"object"}]},"ISSN":["0305-1048","1362-4962"],"issn-type":[{"value":"0305-1048","type":"print"},{"value":"1362-4962","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,1,7]]},"published":{"date-parts":[[2021,11,10]]}}}