{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,1,23]],"date-time":"2024-01-23T23:29:11Z","timestamp":1706052551397},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2008,8]]},"abstract":"<jats:p>The number of potentially-related data resources available for querying --- databases, data warehouses, virtual integrated schemas --- continues to grow rapidly. Perhaps no area has seen this problem as acutely as the life sciences, where hundreds of large, complex, interlinked data resources are available on fields like proteomics, genomics, disease studies, and pharmacology. The schemas of individual databases are often large on their own, but users also need to pose queries across multiple sources, exploiting foreign keys and schema mappings. Since the users are not experts, they typically rely on the existence of pre-defined Web forms and associated query templates, developed by programmers to meet the particular scientists' needs. Unfortunately, such forms are scarce commodities, often limited to a single database, and mismatched with biologists' information needs that are often context-sensitive and span multiple databases.<\/jats:p>\n          <jats:p>\n            We present a system with which a non-expert user can author new query templates and Web forms, to be reused by anyone with related information needs. The user poses keyword queries that are matched against source relations and their attributes; the system uses sequences of associations (e.g., foreign keys, links, schema mappings, synonyms, and taxonomies) to create multiple ranked queries linking the matches to keywords; the set of queries is attached to a Web query form. 
Now the user and his or her associates may pose specific queries by filling in parameters in the form. Importantly, the answers to this query are ranked and annotated with data provenance, and the user provides\n            <jats:italic>feedback<\/jats:italic>\n            on the utility of the answers, from which the system ultimately learns to assign costs to sources and associations according to the user's specific information need, as a result changing the ranking of the queries used to generate results. We evaluate the effectiveness of our method against \"gold standard\" costs from domain experts and demonstrate the method's scalability.\n          <\/jats:p>","DOI":"10.14778\/1453856.1453941","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"785-796","source":"Crossref","is-referenced-by-count":37,"title":["Learning to create data-integrating queries"],"prefix":"10.14778","volume":"1","author":[{"given":"Partha Pratim","family":"Talukdar","sequence":"first","affiliation":[{"name":"University of Pennsylvania, Philadelphia, PA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marie","family":"Jacob","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, PA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Muhammad Salman","family":"Mehmood","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, PA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Koby","family":"Crammer","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, PA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zachary G.","family":"Ives","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, 
PA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fernando","family":"Pereira","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, PA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sudipto","family":"Guha","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, PA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2008,8]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Modern Information Retrieval","author":"Baeza-Yates R. A.","year":"1999","unstructured":"R. A. Baeza-Yates and B. A. Ribeiro-Neto. Modern Information Retrieval. ACM Press\/Addison-Wesley, 1999."},{"key":"e_1_2_1_2_1","volume-title":"VLDB","author":"Balmin A.","year":"2004","unstructured":"A. Balmin, V. Hristidis, and Y. Papakonstantinou. Objectrank: Authority-based keyword search in databases. In VLDB, 2004."},{"key":"e_1_2_1_3_1","first-page":"3","article-title":"Global discriminative training for higher-accuracy computational gene prediction","author":"Bernal A.","year":"2007","unstructured":"A. Bernal, K. Crammer, A. Hatzigeorgiou, and F. Pereira. Global discriminative training for higher-accuracy computational gene prediction. PLoS Computational Biology, 3, 2007.","journal-title":"PLoS Computational Biology"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/876875.879034"},{"key":"e_1_2_1_5_1","volume-title":"WebDB","author":"Botev C.","year":"2005","unstructured":"C. Botev and J. Shanmugasundaram.
Context-sensitive keyword search and ranking for XML. In WebDB, 2005."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btm088"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/645504.656274"},{"key":"e_1_2_1_8_1","volume-title":"ICDE","author":"Carey M. J.","year":"2004","unstructured":"M. J. Carey. BEA Liquid Data for WebLogic: XML-based enterprise information integration. In ICDE, 2004."},{"key":"e_1_2_1_9_1","volume-title":"WebDB '00","author":"Carey M. J.","year":"2000","unstructured":"M. J. Carey, D. Florescu, Z. G. Ives, Y. Lu, J. Shanmugasundaram, E. Shekita, and S. Subramanian. XPERANTO: Publishing object-relational data as XML. In WebDB '00, 2000."},{"key":"e_1_2_1_10_1","volume-title":"CIDR","author":"Chandrasekaran S.","year":"2003","unstructured":"S. Chandrasekaran, O. Cooper, A. Deshpande, M. J. Franklin, J. M. Hellerstein, W. Hong, S. Krishnamurthy, S. Madden, V. Raman, F. Reiss, and M. A. Shah. TelegraphCQ: Continuous dataflow processing for an uncertain world.
In CIDR, 2003."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/276304.276323"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1248547.1248566"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/1316689.1316764"},{"key":"e_1_2_1_15_1","first-page":"19","article-title":"Reduction tests for the steiner problem in graphs","author":"Duin C.","year":"1989","unstructured":"C. Duin and A. Volgenant. Reduction tests for the steiner problem in graphs. Netw., 19, 1989.","journal-title":"Netw."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.9"},{"key":"e_1_2_1_17_1","volume-title":"Computers and Intractability: A Guide to the Theory of NP-Completeness","author":"Garey M. R.","year":"1979","unstructured":"M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/775152.775166"},{"key":"e_1_2_1_19_1","volume-title":"VLDB","author":"Green T. J.","year":"2007","unstructured":"T. J. Green, G. Karvounarakis, Z. G. Ives, and V. Tannen. Update exchange with mappings and provenance. In VLDB, 2007. Amended version available as Univ.
of Pennsylvania report MS-CIS-07-26."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1265530.1265535"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872762"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2003.1260817"},{"key":"e_1_2_1_23_1","volume-title":"VLDB","author":"Hristidis V.","year":"2002","unstructured":"V. Hristidis and Y. Papakonstantinou. Discover: Keyword search in relational databases. In VLDB, 2002."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304209"},{"key":"e_1_2_1_25_1","volume-title":"VLDB","author":"Kacholia V.","year":"2005","unstructured":"V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, and H. Karambelkar. Bidirectional expansion for keyword search on graph databases. In VLDB, 2005."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376756"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1142351.1142377"},{"key":"e_1_2_1_28_1","first-page":"419","article-title":"The Plasmodium genome database: Designing and mining a eukaryotic genomics resource","author":"Kissinger J. C.","year":"2002","unstructured":"J. C. Kissinger, B. P. Brunk, J. Crabtree, M. J. Fraunholz, B. Gajria, A. J. Milgram, D. S. Pearson, J.
Schug, A. Bahl, S. J. Diskin, H. Ginsburg, G. R. Grant, D. Gupta, P. Labo, L. Li, M. D. Mailman, S. K. McWeeney, P. Whetzel, C. J. Stoeckert, Jr., and D. S. Roos. The Plasmodium genome database: Designing and mining a eukaryotic genomics resource. Nature, 419, 2002.","journal-title":"Nature"},{"key":"e_1_2_1_29_1","first-page":"18","article-title":"A procedure for computing the k best solutions to discrete optimization problems and its application to the shortest path problem","author":"Lawler E. L.","year":"1972","unstructured":"E. L. Lawler. A procedure for computing the k best solutions to discrete optimization problems and its application to the shortest path problem. Management Science, 18, 1972.","journal-title":"Management Science"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066173"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1005566.1005569"},{"key":"e_1_2_1_32_1","volume-title":"Online learning of approximate dependency parsing algorithms","author":"McDonald R.","year":"2006","unstructured":"R. McDonald and F. Pereira. Online learning of approximate dependency parsing algorithms.
In European Association for Computational Linguistics, 2006."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/511446.511524"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/604045.604070"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/s007780100057"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1378773.1378792"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2008.4497497"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1002\/net.3230170203"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01758765"},{"key":"e_1_2_1_40_1","unstructured":"L. Wolsey. Integer Programming. Wiley-Interscience 1998."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02612335"},{"key":"e_1_2_1_42_1","volume-title":"Finding the k shortest loopless paths in a network. Management Science, 18(17)","author":"Yen J. Y.","year":"1971","unstructured":"J. Y. Yen. Finding the k shortest loopless paths in a network.
Management Science, 18(17), 1971."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1453856.1453941","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:14:47Z","timestamp":1672226087000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1453856.1453941"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,8]]},"references-count":41,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,8]]}},"alternative-id":["10.14778\/1453856.1453941"],"URL":"https:\/\/doi.org\/10.14778\/1453856.1453941","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2008,8]]}}}