{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T00:45:32Z","timestamp":1740185132750,"version":"3.37.3"},"reference-count":17,"publisher":"Oxford University Press (OUP)","issue":"21","license":[{"start":{"date-parts":[[2017,7,21]],"date-time":"2017-07-21T00:00:00Z","timestamp":1500595200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/501100001711","name":"Swiss National Science Foundation","doi-asserted-by":"publisher","award":["P2TIP1_161635","P01 CA016038","P50 CA121974"],"award-info":[{"award-number":["P2TIP1_161635","P01 CA016038","P50 CA121974"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Text and genomic data are composed of sequential tokens, such as words and nucleotides that give rise to higher order syntactic constructs. In this work, we aim at providing a comprehensive Python library implementing conditional random fields (CRFs), a class of probabilistic graphical models, for robust prediction of these constructs from sequential data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Python Sequence Labeling (PySeqLab) is an open source package for performing supervised learning in structured prediction tasks. It implements CRFs models, that is discriminative models from (i) first-order to higher-order linear-chain CRFs, and from (ii) first-order to higher-order semi-Markov CRFs (semi-CRFs). Moreover, it provides multiple learning algorithms for estimating model parameters such as (i) stochastic gradient descent (SGD) and its multiple variations, (ii) structured perceptron with multiple averaging schemes supporting exact and inexact search using \u2018violation-fixing\u2019 framework, (iii) search-based probabilistic online learning algorithm (SAPO) and (iv) an interface for Broyden\u2013Fletcher\u2013Goldfarb\u2013Shanno (BFGS) and the limited-memory BFGS algorithms. Viterbi and Viterbi A* are used for inference and decoding of sequences. Using PySeqLab, we built models (classifiers) and evaluated their performance in three different domains: (i) biomedical Natural language processing (NLP), (ii) predictive DNA sequence analysis and (iii) Human activity recognition (HAR). State-of-the-art performance comparable to machine-learning based systems was achieved in the three domains without feature engineering or the use of knowledge sources.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>PySeqLab is available through https:\/\/bitbucket.org\/A_2\/pyseqlab with tutorials and documentation.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx451","type":"journal-article","created":{"date-parts":[[2017,7,20]],"date-time":"2017-07-20T08:10:32Z","timestamp":1500538232000},"page":"3497-3499","source":"Crossref","is-referenced-by-count":4,"title":["PySeqLab: an open source Python package for sequence labeling and segmentation"],"prefix":"10.1093","volume":"33","author":[{"given":"Ahmed","family":"Allam","sequence":"first","affiliation":[{"name":"Department of Pathology, Yale School of Medicine, New Haven, CT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Krauthammer","sequence":"additional","affiliation":[{"name":"Department of Pathology, Yale School of Medicine, New Haven, CT, USA"},{"name":"Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2017,7,21]]},"reference":[{"key":"2023051506354625800_btx451-B1","first-page":"217","article-title":"Large scale online learning","volume":"16","author":"Bottou","year":"2004","journal-title":"Adv. Neural Inf. Process. Syst"},{"key":"2023051506354625800_btx451-B2","doi-asserted-by":"crossref","first-page":"2033","DOI":"10.1016\/j.patrec.2012.12.014","article-title":"The Opportunity challenge: a benchmark database for on-body sensor-based activity recognition","volume":"34","author":"Chavarriaga","year":"2013","journal-title":"Pattern Recogn. Lett"},{"first-page":"1","year":"2002","author":"Collins","key":"2023051506354625800_btx451-B3"},{"key":"2023051506354625800_btx451-B4","first-page":"981","article-title":"Conditional random field with high-order dependencies for sequence labeling and segmentation","volume":"15","author":"Cuong","year":"2014","journal-title":"J. Mach. Learn. Res"},{"first-page":"142","year":"2012","author":"Huang","key":"2023051506354625800_btx451-B5"},{"issue":"1","key":"2023051506354625800_btx451-B6","first-page":"315","article-title":"Accelerating stochastic gradient descent using predictive variance reduction","volume":"26","author":"Johnson","year":"2013","journal-title":"Adv. Neural Inf. Process. Syst"},{"first-page":"70","year":"2004","author":"Kim","key":"2023051506354625800_btx451-B7"},{"first-page":"282","year":"2001","author":"Lafferty","key":"2023051506354625800_btx451-B8"},{"key":"2023051506354625800_btx451-B9","first-page":"530","article-title":"Training knowledge-based neural networks to recognize genes in DNA sequences","volume":"3","author":"Noordewier","year":"1991","journal-title":"Adv. Neural Inf. Process. Syst"},{"first-page":"1185","year":"2004","author":"Sarawagi","key":"2023051506354625800_btx451-B10"},{"first-page":"12","year":"1990","author":"Soong","key":"2023051506354625800_btx451-B11"},{"year":"2015","author":"Sun","key":"2023051506354625800_btx451-B12"},{"first-page":"477","year":"2009","author":"Tsuruoka","key":"2023051506354625800_btx451-B13"},{"year":"2016","author":"Vieira","key":"2023051506354625800_btx451-B14"},{"key":"2023051506354625800_btx451-B15","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1109\/TIT.1967.1054010","article-title":"Error bounds for convolutional codes and an asymptotically optimum decoding algorithm","volume":"13","author":"Viterbi","year":"1967","journal-title":"IEEE Trans. Inf. Theory"},{"key":"2023051506354625800_btx451-B16","first-page":"2","article-title":"Conditional Random Fields with High-Order Features for Sequence Labeling","volume":"2","author":"Ye","year":"2009","journal-title":"Neural Inf. Process. Syst"},{"year":"2012","author":"Zeiler","key":"2023051506354625800_btx451-B17"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/21\/3497\/50315479\/bioinformatics_33_21_3497.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/21\/3497\/50315479\/bioinformatics_33_21_3497.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T06:36:03Z","timestamp":1684132563000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/21\/3497\/3983261"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2017,7,21]]},"references-count":17,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2017,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx451","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2017,11,1]]},"published":{"date-parts":[[2017,7,21]]}}}