{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,14]],"date-time":"2025-02-14T05:28:35Z","timestamp":1739510915122,"version":"3.37.0"},"reference-count":13,"publisher":"World Scientific Pub Co Pte Ltd","issue":"06","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Bioinform. Comput. Biol."],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:p>Computational tools are essential components of modern biological research. For example, BLAST searches can be used to identify related proteins based on sequence homology, or when a new genome is sequenced, prediction models can be used to annotate functional sites such as transcription start sites, translation initiation sites and polyadenylation sites and to predict protein localization. Here we present Sirius Prediction Systems Builder (PSB), a new computational tool for sequence analysis, classification and searching. Sirius PSB has four main operations: (1) Building a classifier, (2) Deploying a classifier, (3) Search for proteins similar to query proteins, (4) Preliminary and post-prediction analysis. Sirius PSB supports all these operations via a simple and interactive graphical user interface. Besides being a convenient tool, Sirius PSB has also introduced two novelties in sequence analysis. Firstly, genetic algorithm is used to identify interesting features in the feature space. Secondly, instead of the conventional method of searching for similar proteins via sequence similarity, we introduced searching via features' similarity. To demonstrate the capabilities of Sirius PSB, we have built two prediction models \u2014 one for the recognition of Arabidopsis polyadenylation sites and another for the subcellular localization of proteins. Both systems are competitive against current state-of-the-art models based on evaluation of public datasets. More notably, the time and effort required to build each model is greatly reduced with the assistance of Sirius PSB. Furthermore, we show that under certain conditions when BLAST is unable to find related proteins, Sirius PSB can identify functionally related proteins based on their biophysical similarities. Sirius PSB and its related supplements are available at:<\/jats:p>","DOI":"10.1142\/s0219720009004436","type":"journal-article","created":{"date-parts":[[2009,12,8]],"date-time":"2009-12-08T04:36:16Z","timestamp":1260246976000},"page":"973-990","source":"Crossref","is-referenced-by-count":5,"title":["SIRIUS PSB: A GENERIC SYSTEM FOR ANALYSIS OF BIOLOGICAL SEQUENCES"],"prefix":"10.1142","volume":"07","author":[{"given":"CHUAN HOCK","family":"KOH","sequence":"first","affiliation":[{"name":"School of Computing, National University of Singapore, COM1, Computing Drive, 117417, Singapore"},{"name":"NUS Graduate School for Integrative Sciences and Engineering, 117597, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"SHARENE","family":"LIN","sequence":"additional","affiliation":[{"name":"School of Computing, National University of Singapore, COM1, Computing Drive, 117417, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"GREGORY","family":"JEDD","sequence":"additional","affiliation":[{"name":"Temasek Life Sciences Laboratory and Department of Biological Sciences, National University of Singapore, 117604, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"LIMSOON","family":"WONG","sequence":"additional","affiliation":[{"name":"School of Computing, National University of Singapore, COM1, Computing Drive, 117417, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"219","published-online":{"date-parts":[[2011,11,21]]},"reference":[{"key":"rf1","doi-asserted-by":"publisher","DOI":"10.1038\/nrg1325"},{"key":"rf2","doi-asserted-by":"publisher","DOI":"10.1038\/nature01511"},{"key":"rf3","doi-asserted-by":"publisher","DOI":"10.1038\/nmeth1154"},{"key":"rf4","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bth437"},{"key":"rf7","doi-asserted-by":"crossref","first-page":"255","DOI":"10.3233\/ISB-00132","volume":"4","author":"Liu H.","journal-title":"In. silico. Biol."},{"key":"rf8","doi-asserted-by":"publisher","DOI":"10.1142\/S0219720003000216"},{"volume-title":"Data Mining: Practical Machine Learning Tools and Techniques","year":"2005","author":"Witten I. H.","key":"rf9"},{"key":"rf11","doi-asserted-by":"publisher","DOI":"10.1016\/j.cell.2005.11.047"},{"key":"rf13","doi-asserted-by":"publisher","DOI":"10.1006\/jmbi.2000.3903"},{"key":"rf14","doi-asserted-by":"publisher","DOI":"10.1104\/pp.105.060541"},{"key":"rf15","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-8-43"},{"key":"rf16","doi-asserted-by":"publisher","DOI":"10.1186\/gb-2007-8-12-234"},{"key":"rf17","first-page":"273","volume":"20","author":"Cortes C.","journal-title":"Machine Learning"}],"container-title":["Journal of Bioinformatics and Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0219720009004436","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,13]],"date-time":"2025-02-13T17:55:44Z","timestamp":1739469344000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0219720009004436"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,12]]},"references-count":13,"journal-issue":{"issue":"06","published-online":{"date-parts":[[2011,11,21]]},"published-print":{"date-parts":[[2009,12]]}},"alternative-id":["10.1142\/S0219720009004436"],"URL":"https:\/\/doi.org\/10.1142\/s0219720009004436","relation":{},"ISSN":["0219-7200","1757-6334"],"issn-type":[{"type":"print","value":"0219-7200"},{"type":"electronic","value":"1757-6334"}],"subject":[],"published":{"date-parts":[[2009,12]]}}}