{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,25]],"date-time":"2025-03-25T15:40:08Z","timestamp":1742917208265,"version":"3.40.3"},"reference-count":25,"publisher":"IGI Global","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,7,1]]},"abstract":"<p>Whenever new DNA or protein sequences are decoded, it is almost compulsory to look at similar sequences and the papers describing them, both to collect relevant information on the function and activity of the new sequences and to learn what is already known about similar sequences. Current web sites and databases of sequences usually provide a set of curated paper references linked to each sequence. Those links are a good starting point when looking for relevant information related to a set of sequences. One way to implement such an approach is to run BLAST on the newly decoded sequences and collect similar sequences. Then one looks at the papers linked to the similar sequences. Most often the number of retrieved papers is small, and one has to search large databases for relevant papers. This paper proposes a process for generating a classifier based on the initial set of relevant papers. First, the authors collect similar sequences using an alignment algorithm such as BLAST. Then, the authors use the enlarged set of papers to construct a classifier. 
Finally, the constructed classifier is used to automatically enlarge the set of relevant papers by searching MEDLINE.<\/p>","DOI":"10.4018\/jkdb.2011070102","type":"journal-article","created":{"date-parts":[[2012,4,5]],"date-time":"2012-04-05T13:06:15Z","timestamp":1333631175000},"page":"21-36","source":"Crossref","is-referenced-by-count":2,"title":["BioTextRetriever"],"prefix":"10.4018","volume":"2","author":[{"given":"C\u00e9lia Talma","family":"Gon\u00e7alves","sequence":"first","affiliation":[{"name":"Instituto Superior de Contabilidade e Administra\u00e7\u00e3o do Porto & CEISE-STI, Portugal"}]},{"given":"Rui","family":"Camacho","sequence":"additional","affiliation":[{"name":"Universidade do Porto, Portugal"}]},{"given":"Eug\u00e9nio","family":"Oliveira","sequence":"additional","affiliation":[{"name":"Universidade do Porto, Portugal"}]}],"member":"2432","reference":[{"key":"jkdb.2011070102-0","doi-asserted-by":"publisher","DOI":"10.1038\/75556"},{"key":"jkdb.2011070102-1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00058655"},{"key":"jkdb.2011070102-2","doi-asserted-by":"crossref","unstructured":"Caruana, R., Niculescu-Mizil, A., Crew, G., & Ksikes, A. (2004). Ensemble selection from libraries of models. In Proceedings of the Twenty-First International Conference on Machine Learning (p. 18).","DOI":"10.1145\/1015330.1015432"},{"key":"jkdb.2011070102-3","doi-asserted-by":"crossref","unstructured":"Dietterich, T. G. (2000). Ensemble methods in machine learning. In Proceedings of the First International Workshop on Multiple Classifier Systems.","DOI":"10.1007\/3-540-45014-9_1"},{"key":"jkdb.2011070102-4","unstructured":"Divoli, A., Winter, R., Pettifer, S., & Attwood, T. (2005). BioQSpace: An interactive visualisation tool for clustering MEDLINE abstracts. Retrieved from http:\/\/wenku.baidu.com\/view\/06d2c4d9a58da0116d174905"},{"key":"jkdb.2011070102-5","unstructured":"Dollah, R., Seddiqui, M. H., & Aono, M. (2010). 
The effect of using hierarchical structure for classifying biomedical text abstracts. Retrieved from https:\/\/kaigi.org\/jsai\/webprogram\/2010\/paper-92.html"},{"key":"jkdb.2011070102-6","doi-asserted-by":"publisher","DOI":"10.1023\/B:MACH.0000015881.36452.6e"},{"key":"jkdb.2011070102-7","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4899-4541-9","author":"B.Efron","year":"1993","journal-title":"An introduction to the bootstrap"},{"key":"jkdb.2011070102-8","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/7287.001.0001","author":"C.Fellbaum","year":"1998","journal-title":"WordNet: An electronical lexical database"},{"key":"jkdb.2011070102-9","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.1997.1504"},{"key":"jkdb.2011070102-10","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2010.152"},{"key":"jkdb.2011070102-11","doi-asserted-by":"crossref","unstructured":"Gon\u00e7alves, C. A., Gon\u00e7alves, C. T., Camacho, R., & Oliveira, E. C. (2010, June). The impact of preprocessing on the classification of MEDLINE documents. In Proceedings of the 10th International Workshop on Pattern Recognition in Information Systems, in conjunction with the International Conference on Enterprise Information Systems, Funchal, Madeira, Portugal (pp. 53-61).","DOI":"10.5220\/0003028700530061"},{"key":"jkdb.2011070102-12","doi-asserted-by":"crossref","unstructured":"Gon\u00e7alves, C. T., Camacho, R., & Oliveira, R. (2011). From sequences to papers: An information retrieval exercise. 
In Proceedings of the 2nd Workshop on Biological Data Mining and its Applications in Healthcare collocated with 10th IEEE International Conference on Data Mining, Vancouver, BC, Canada.","DOI":"10.1109\/ICDMW.2011.184"},{"key":"jkdb.2011070102-13","doi-asserted-by":"publisher","DOI":"10.1145\/1656274.1656278"},{"issue":"5","key":"jkdb.2011070102-14","article-title":"A lazy ensemble learning method to classification.","volume":"7","author":"H.Homayouni","year":"2010","journal-title":"International Journal of Computer Science Issues"},{"key":"jkdb.2011070102-15","doi-asserted-by":"publisher","DOI":"10.5120\/1989-2679"},{"key":"jkdb.2011070102-16","unstructured":"Indra, N., Sarkar, N., Schenk, R., Miller, H., & Norton, C. (2009). LigerCat: using \u201cMeSH Clouds\u201d from journal, article, or gene citations to facilitate the identification of relevant biomedical literature. In Proceedings of the American Medical Informatics Association Annual Symposium (pp. 563-567)."},{"issue":"4","key":"jkdb.2011070102-17","first-page":"324","article-title":"Combining bagging and boosting.","volume":"1","author":"S.Kotsiantis","year":"2004","journal-title":"International Journal of Computational Intelligence"},{"key":"jkdb.2011070102-18","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1613\/jair.614","article-title":"Popular ensemble methods: An empirical study.","volume":"11","author":"D.Opitz","year":"1999","journal-title":"Journal of Artificial Intelligence Research"},{"key":"jkdb.2011070102-19","first-page":"313","article-title":"An algorithm for suffix stripping","author":"M. F.Porter","year":"1997","journal-title":"Reading in information retrieval"},{"key":"jkdb.2011070102-20","unstructured":"Rebholz-Schuhmann, D., Pezik, P., Lee, V., Kim, J.-J., Del Gratta, R., & Sasaki, Y. \u2026Ananiadou, S. (2008). Biolexicon: Towards a reference terminological resource in the biomedical domain. 
In Proceedings of the 16th Annual International Conference on Intelligent Systems for Molecular Biology."},{"key":"jkdb.2011070102-21","doi-asserted-by":"publisher","DOI":"10.1109\/TCBB.2009.83"},{"key":"jkdb.2011070102-22","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bti475"},{"key":"jkdb.2011070102-23","first-page":"34","article-title":"Database resources of the national center for biotechnology information.","author":"D. L.Wheeler","year":"2006","journal-title":"Nucleic Acids Research"},{"issue":"2","key":"jkdb.2011070102-24","article-title":"A tutorial on information retrieval: basic terms and concepts.","volume":"1","author":"W.Zhou","year":"2006","journal-title":"Journal of Biomedical Discovery and Collaboration"}],"container-title":["International Journal of Knowledge Discovery in Bioinformatics"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=63615","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,25]],"date-time":"2025-03-25T14:35:37Z","timestamp":1742913337000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/jkdb.2011070102"}},"subtitle":["A Tool to Retrieve Relevant Papers"],"short-title":[],"issued":{"date-parts":[[2011,7,1]]},"references-count":25,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2011,7]]}},"URL":"https:\/\/doi.org\/10.4018\/jkdb.2011070102","relation":{},"ISSN":["1947-9115","1947-9123"],"issn-type":[{"type":"print","value":"1947-9115"},{"type":"electronic","value":"1947-9123"}],"subject":[],"published":{"date-parts":[[2011,7,1]]}}}