{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:43:50Z","timestamp":1750308230499,"version":"3.41.0"},"reference-count":17,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2004,6,1]],"date-time":"2004-06-01T00:00:00Z","timestamp":1086048000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGMOD Rec."],"published-print":{"date-parts":[[2004,6]]},"abstract":"<jats:p>As databases become more pervasive through the biological sciences, various data quality issues regarding data legacy, data uniformity and data duplication arise. Due to the nature of this data, each of these problems is non-trivial. For biological data to be corrected and standardized, new methods and frameworks must be developed. 
This paper proposes one such framework, called BIO-AJAX, which uses principles from data cleaning to improve data quality in biological information systems, specifically in TreeBASE.<\/jats:p>","DOI":"10.1145\/1024694.1024703","type":"journal-article","created":{"date-parts":[[2005,11,9]],"date-time":"2005-11-09T22:23:27Z","timestamp":1131575007000},"page":"51-57","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["BIO-AJAX"],"prefix":"10.1145","volume":"33","author":[{"given":"Katherine G.","family":"Herbert","sequence":"first","affiliation":[{"name":"University Heights, Newark, NJ"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Narain H.","family":"Gehani","sequence":"additional","affiliation":[{"name":"University Heights, Newark, NJ"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"William H.","family":"Piel","sequence":"additional","affiliation":[{"name":"State University of New York at Buffalo, Buffalo, NY"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jason T. L.","family":"Wang","sequence":"additional","affiliation":[{"name":"University Heights, Newark, NJ"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cathy H.","family":"Wu","sequence":"additional","affiliation":[{"name":"Georgetown University Medical Center, NW, Washington"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2004,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/28.1.15"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0168-9525(99)01706-0"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/861869"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0168-9525(01)02348-4"},{"key":"e_1_2_1_5_1","unstructured":"Federhen S. Harrison I. Hotton C. Leipe D. Soussov V. Sternberg R. and Turner S. NCBI Taxonomy Homepage. 
http:\/\/www.ncbi.nlm.nih.gov\/Taxonomy\/taxonomyhome.html\/."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkh216"},{"key":"e_1_2_1_7_1","first-page":"371","article-title":"Declarative Data Cleaning: Language, Model and Algorithms","author":"Galhardas H.","year":"2001","journal-title":"Proc. of the 27th International Conference on Very Large Data Bases"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/18.12.1553"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4379(01)00041-2"},{"key":"e_1_2_1_10_1","first-page":"335","volume-title":"A Model Based Mediator System for Scientific Data Management.\" Eds. Z. Lacroix and T. Critchlow","author":"Lud\u00e4scher B.","year":"2003"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btg131"},{"volume-title":"Unordered Tree Mining with Applications to Phylogeny.\" In Proc. 
of the 20th International Conference on Data Engineering","year":"2004","author":"Shasha D.","key":"e_1_2_1_12_1"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1147\/sj.402.0426"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1093\/protein\/9.5.381"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/28.1.10"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1476-9271(02)00098-1"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkg040"}],"container-title":["ACM SIGMOD Record"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1024694.1024703","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1024694.1024703","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T16:31:35Z","timestamp":1750264295000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1024694.1024703"}},"subtitle":["an extensible framework for biological data cleaning"],"short-title":[],"issued":{"date-parts":[[2004,6]]},"references-count":17,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2004,6]]}},"alternative-id":["10.1145\/1024694.1024703"],"URL":"https:\/\/doi.org\/10.1145\/1024694.1024703","relation":{},"ISSN":["0163-5808"],"issn-type":[{"type":"print","value":"0163-5808"}],"subject":[],"published":{"date-parts":[[2004,6]]},"assertion":[{"value":"2004-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}