{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,6,3]],"date-time":"2024-06-03T11:55:12Z","timestamp":1717415712373},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2013,8,29]]},"abstract":"<jats:p>A growing number of resources are available for enriching documents with semantic annotations. While originally focused on a few standard classes of annotations, the ecosystem of annotators is now becoming increasingly diverse. Although annotators often have very different vocabularies, with both high-level and specialist concepts, they also have many semantic interconnections. We will show that both the overlap and the diversity in annotator vocabularies motivate the need for semantic annotation integration: middleware that produces a unified annotation on top of diverse semantic annotators. On the one hand, the diversity of vocabulary allows applications to benefit from the much richer vocabulary available in an integrated vocabulary. On the other hand, we present evidence that the most widely-used annotators on the web suffer from serious accuracy deficiencies: the overlap in vocabularies from individual annotators allows an integrated annotator to boost accuracy by exploiting inter-annotator agreement and disagreement.<\/jats:p><jats:p>The integration of semantic annotations leads to new challenges, both compared to usual data integration scenarios and to standard aggregation of machine learning tools. We overview an approach to these challenges that performs ontology-aware aggregation. We introduce an approach that requires no training data, making use of ideas from database repair. We experimentally compare this with a supervised approach, which adapts maximal entropy Markov models to the setting of ontology-based annotations. 
We further experimentally compare both these approaches with respect to ontology-unaware supervised approaches, and to individual annotators.<\/jats:p>","DOI":"10.14778\/2536258.2536261","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"1486-1497","source":"Crossref","is-referenced-by-count":12,"title":["Aggregating semantic annotators"],"prefix":"10.14778","volume":"6","author":[{"given":"Luying","family":"Chen","sequence":"first","affiliation":[{"name":"Oxford University, UK"}]},{"given":"Stefano","family":"Ortona","sequence":"additional","affiliation":[{"name":"Oxford University, UK"}]},{"given":"Giorgio","family":"Orsi","sequence":"additional","affiliation":[{"name":"Oxford University, UK"}]},{"given":"Michael","family":"Benedikt","sequence":"additional","affiliation":[{"name":"Oxford University, UK"}]}],"member":"320","published-online":{"date-parts":[[2013,8]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"FOX. http:\/\/ontowiki.net\/Projects\/FOX?v=4e5."},{"key":"e_1_2_1_2_1","unstructured":"LingPipe. http:\/\/alias-i.com\/lingpipe\/."},{"key":"e_1_2_1_3_1","unstructured":"MUC7. http:\/\/www.ldc.upenn.edu\/Catalog\/catalogEntry.jsp?catalogId=LDC2001T02."},{"key":"e_1_2_1_4_1","unstructured":"OpenNLP Tools. http:\/\/opennlp.apache.org\/index.html."},{"key":"e_1_2_1_5_1","unstructured":"Reuters. http:\/\/about.reuters.com\/researchandstandards\/corpus\/index.asp."},{"key":"e_1_2_1_6_1","unstructured":"ROSeAnn. http:\/\/diadem.cs.ox.ac.uk\/roseann."},
{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1145\/383952.384007","volume-title":"SIGIR","author":"Aslam J. A.","year":"2001"},{"key":"e_1_2_1_8_1","first-page":"148","volume-title":"CoNLL","author":"Bender O.","year":"2003"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1016\/j.is.2008.01.005","article-title":"The complexity and approximation of fixing numerical attributes in databases under integrity constraints","volume":"33","author":"Bertossi L.","year":"2008","journal-title":"Inf. Sys."},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-12-460","article-title":"Semantic annotation of biological concepts interplaying microbial cellular responses","volume":"12","author":"Carreira R.","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","unstructured":"L. Chen S. Ortona G. Orsi and M. Benedikt. ROSeAnn: Reconciling opinions of semantic annotators. PVLDB To appear 2013.","DOI":"10.1145\/2567948.2578038"},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","first-page":"462","DOI":"10.1145\/988672.988735","volume-title":"WWW","author":"Cimiano P.","year":"2004"},{"issue":"2","key":"e_1_2_1_13_1","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1017\/S135132490400333X","article-title":"LearningPinocchio: Adaptive information extraction for real world applications","volume":"10","author":"Ciravegna F.","year":"2004","journal-title":"Nat. Lang. 
Eng."},{"issue":"4","key":"e_1_2_1_14_1","first-page":"219","article-title":"Automatic wrappers for large scale web extraction","volume":"4","author":"Dalvi N.","year":"2011","journal-title":"PVLDB"},{"issue":"1","key":"e_1_2_1_15_1","first-page":"550","article-title":"Integrating conflicting data: the role of source dependence","volume":"2","author":"Dong X. L.","year":"2009","journal-title":"PVLDB"},{"key":"e_1_2_1_16_1","first-page":"1226","volume-title":"IJCNN","author":"Duong D.","year":"2006"},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1145\/988672.988687","volume-title":"WWW","author":"Etzioni O.","year":"2004"},{"key":"e_1_2_1_18_1","first-page":"348","volume-title":"IJCAI","author":"Euzenat J.","year":"2007"},{"issue":"2","key":"e_1_2_1_19_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1735886.1735893","article-title":"Querying and repairing inconsistent numerical databases","volume":"35","author":"Flesca S.","year":"2010","journal-title":"ACM Trans. Database Syst."},{"key":"e_1_2_1_20_1","first-page":"168","volume-title":"CoNLL","author":"Florian R.","year":"2003"},{"key":"e_1_2_1_21_1","first-page":"131","volume-title":"WSDM","author":"Galland A.","year":"2010"},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1007\/978-3-642-31485-8_5","volume-title":"Lectures on Logic and Computation","author":"Grossi D.","year":"2012"},{"key":"e_1_2_1_23_1","volume-title":"Sheffield Dept. of CS","author":"H. Cunningham","year":"2011"},{"issue":"2","key":"e_1_2_1_24_1","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1093\/logcom\/exp079","article-title":"Reliable methods of judgement aggregation","volume":"20","author":"Hartmann S.","year":"2010","journal-title":"J. Log. 
Comp."},{"key":"e_1_2_1_25_1","first-page":"275","volume-title":"ICML","author":"Kakade S.","year":"2002"},{"key":"e_1_2_1_26_1","first-page":"460","volume-title":"COLING","author":"Kambhatla N.","year":"2006"},{"issue":"1","key":"e_1_2_1_27_1","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1016\/j.websem.2004.07.005","article-title":"Semantic annotation, indexing, and retrieval","volume":"2","author":"Kiryakov A.","year":"2004","journal-title":"Web Semantics: Science, Services and Agents on the World Wide Web"},{"issue":"3","key":"e_1_2_1_28_1","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1016\/j.datak.2006.06.014","article-title":"Combining data-driven systems for improving named entity recognition","volume":"61","author":"Kozareva Z.","year":"2007","journal-title":"Data Knowl. Eng."},{"key":"e_1_2_1_29_1","first-page":"591","volume-title":"ICML","author":"McCallum A.","year":"2000"},{"key":"e_1_2_1_30_1","first-page":"101","volume-title":"SIGMOD","author":"Michelakis E.","year":"2009"},{"key":"e_1_2_1_31_1","doi-asserted-by":"crossref","first-page":"147","DOI":"10.3115\/1596374.1596399","volume-title":"CoNLL","author":"Ratinov L.","year":"2009"},{"key":"e_1_2_1_32_1","first-page":"73","volume-title":"EACL","author":"Rizzo G.","year":"2012"},{"key":"e_1_2_1_33_1","first-page":"1057","volume-title":"IJCAI","author":"Rosati R.","year":"2011"},{"key":"e_1_2_1_34_1","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1145\/1458502.1458505","volume-title":"WIDM","author":"Senellart P.","year":"2008"},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1145\/1134030.1134044","volume-title":"BIOKDD","author":"Si L.","year":"2005"},{"key":"e_1_2_1_36_1","first-page":"631","volume-title":"WWW","author":"Suchanek F. 
M.","year":"2009"},{"key":"e_1_2_1_37_1","volume-title":"MSM","author":"van Erp M.","year":"2013"},{"key":"e_1_2_1_38_1","first-page":"160","volume-title":"WCICA","author":"Wang H.","year":"2008"},{"key":"e_1_2_1_39_1","first-page":"200","volume-title":"CoNLL","author":"Wu D.","year":"2003"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2536258.2536261","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,14]],"date-time":"2023-07-14T14:44:28Z","timestamp":1689345868000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2536258.2536261"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,8]]},"references-count":39,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2013,8,29]]}},"alternative-id":["10.14778\/2536258.2536261"],"URL":"https:\/\/doi.org\/10.14778\/2536258.2536261","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2013,8]]}}}