{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,1,9]],"date-time":"2025-01-09T05:28:34Z","timestamp":1736400514557,"version":"3.32.0"},"reference-count":53,"publisher":"Cambridge University Press (CUP)","issue":"4","license":[{"start":{"date-parts":[[2007,12,1]],"date-time":"2007-12-01T00:00:00Z","timestamp":1196467200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2007,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We present a simple and intuitive unsound corpus-driven approximation method for turning unification-based grammars, such as HPSG, CLE, or PATR-II into context-free grammars (CFGs). Our research is motivated by the idea that we can exploit (large-scale), hand-written unification grammars not only for the purpose of describing natural language and obtaining a syntactic structure (and perhaps a semantic form), but also to address several other very practical topics. Firstly, to speed up deep parsing by having a cheap recognition pre-filter (the approximated CFG). Secondly, to obtain an indirect stochastic parsing model for the unification grammar through a trained PCFG, obtained from the approximated CFG. This gives us an efficient disambiguation model for the unification-based grammar. Thirdly, to generate domain-specific subgrammars for application areas such as information extraction or question answering. And finally, to compile context-free language models which assist the acoustic model of a speech recognizer. The approximation method is unsound in that it does not generate a CFG whose language is a true superset of the language accepted by the original unification-based grammar. 
It is a corpus-driven method in that it relies on a corpus of parsed sentences and generates broader CFGs when given more input samples. Our open approach can be fine-tuned in different directions, allowing us to monotonically come close to the original parse trees by shifting more information into the context-free symbols. The approach has been fully implemented in <jats:sc>JAVA<\/jats:sc>.<\/jats:p>","DOI":"10.1017\/s1351324906004128","type":"journal-article","created":{"date-parts":[[2006,4,26]],"date-time":"2006-04-26T14:47:18Z","timestamp":1146062838000},"page":"317-351","source":"Crossref","is-referenced-by-count":0,"title":["From UBGs to CFGs A practical corpus-driven approach"],"prefix":"10.1017","volume":"13","author":[{"given":"HANS-ULRICH","family":"KRIEGER","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"56","published-online":{"date-parts":[[2007,12,1]]},"reference":[{"key":"S1351324906004128_ref51","unstructured":"Van Tichelen L. (2003) Semantic Interpretation for Speech Recognition. Technical report, W3C Working Draft 1 April 2003 http:\/\/www.w3.org\/TR\/2003\/WD-semantic-interpretation-20030401\/."},{"key":"S1351324906004128_ref50","unstructured":"Uszkoreit H. (1986) Categorial Unification Grammars. Proceedings of the 11th International Conference on Computational Linguistics, pp. 187\u2013194."},{"key":"S1351324906004128_ref49","doi-asserted-by":"publisher","DOI":"10.3115\/981210.981228"},{"key":"S1351324906004128_ref52","unstructured":"Zeevat H. , Klein E. and Calder J. (1987) Unification Categorial Grammar. In: Haddock N. , Klein E. , and Merrill G. , editors, Edinburgh Working Papers in Cognitive Science, 1: Categorial Grammar, Unification Grammar, and Parsing, pp. 195\u2013222. 
Centre for Cognitive Science, Edinburgh University, UK."},{"key":"S1351324906004128_ref48","first-page":"39","volume-title":"Research on Interactive Acquisition and Use of Knowledge","author":"Shieber","year":"1983"},{"key":"S1351324906004128_ref41","doi-asserted-by":"publisher","DOI":"10.3115\/981967.981984"},{"key":"S1351324906004128_ref37","unstructured":"Neumann G. and Flickinger D. (1999) Learning Stochastic Lexicalized Tree Grammars from HPSG. Technical report, German Research Center for Artificial Intelligence (DFKI), Saarbr\u00fccken."},{"volume-title":"Data-Oriented Parsing","year":"2003","author":"Neumann","key":"S1351324906004128_ref36"},{"key":"S1351324906004128_ref35","doi-asserted-by":"publisher","DOI":"10.1162\/089120100561610"},{"key":"S1351324906004128_ref31","doi-asserted-by":"publisher","DOI":"10.1016\/0885-2308(90)90022-X"},{"key":"S1351324906004128_ref29","unstructured":"Krieger H.-U. , Drozdzynski W. , Piskorski J. , Sch\u00e4fer U. and Xu F. (2004) A Bag of Useful Techniques for Unification-Based Finite-State Transducers. Proceedings of KONVENS 2004, pp. 105\u2013112."},{"key":"S1351324906004128_ref25","doi-asserted-by":"publisher","DOI":"10.3115\/1034678.1034750"},{"key":"S1351324906004128_ref24","doi-asserted-by":"publisher","DOI":"10.1007\/1-4020-2295-6_11"},{"volume-title":"Head-Driven Phrase Structure Grammar","year":"1994","author":"Pollard","key":"S1351324906004128_ref44"},{"key":"S1351324906004128_ref27","unstructured":"Kiefer B. , Krieger H.-U. and Prescher D. (2002) A Novel Disambiguation Method For Unification-Based Grammars Using Probabilistic Context-Free Approximations. 
Proceedings of the 19th International Conference on Computational Linguistics, COLING2002."},{"key":"S1351324906004128_ref28","first-page":"199","volume-title":"Proceedings of the 7th International Colloquium on Grammatical Inference, ICGI-2004","author":"Krieger","year":"2004"},{"key":"S1351324906004128_ref7","doi-asserted-by":"publisher","DOI":"10.3115\/974147.974175"},{"key":"S1351324906004128_ref46","doi-asserted-by":"crossref","unstructured":"Rayner M. , Gorrell G. , Hockey B. A. , Dowding J. and Boye J. (2001b) Do CFG-Based Language Models Need Agreement Constraints. Proceedings of the 2nd Conference of the North American Chapter of the ACL, NAACL2001.","DOI":"10.3115\/1073336.1073366"},{"key":"S1351324906004128_ref43","unstructured":"Pollard C. and Sag I. A. (1987) Information-Based Syntax and Semantics. Vol. I: Fundamentals. CSLI Lecture Notes, Number 13. Stanford: Center for the Study of Language and Information."},{"key":"S1351324906004128_ref38","unstructured":"Nuance (2004) Nuance Home http:\/\/www.nuance.com."},{"key":"S1351324906004128_ref14","doi-asserted-by":"crossref","unstructured":"Dowding J. , Hockey B. A. , Gawron J. M. and Culy C. (2001) Practical Issues in Compiling Typed Unification Grammars for Speech Recognition. Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, ACL-2001, pp. 164\u2013171.","DOI":"10.3115\/1073012.1073034"},{"key":"S1351324906004128_ref20","doi-asserted-by":"publisher","DOI":"10.3115\/993268.993278"},{"key":"S1351324906004128_ref26","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-04230-4_20"},{"volume-title":"Computational Models of Speech Pattern Processing","year":"1999","author":"Moore","key":"S1351324906004128_ref33"},{"key":"S1351324906004128_ref3","unstructured":"Becker M. , Drozdzynski W. , Krieger H.-U. , Piskorski J. , Sch\u00e4fer U. and Xu F. (2002) SProUT-Shallow Processing with Unification and Typed Feature Structures. 
Proceedings of the International Conference on Natural Language Processing, ICON-2002."},{"key":"S1351324906004128_ref40","article-title":"Towards Systematic Grammar Profiling. Test Suite Technology Ten Years After","volume":"12","author":"Oepen","year":"1998","journal-title":"Journal of Computer Speech and Language"},{"key":"S1351324906004128_ref10","unstructured":"Carroll J. A. (1993) Practical Unification-based Parsing of Natural Language. PhD thesis, University of Cambridge, Computer Laboratory, Cambridge."},{"volume-title":"Compilers: Principles, Techniques, and Tools","year":"1986","author":"Aho","key":"S1351324906004128_ref1"},{"key":"S1351324906004128_ref13","unstructured":"Diagne A. K. , Kasper W. and Krieger H.-U. (1995) Distributed Parsing With HPSG Grammars In Proceedings of the 4th International Workshop on Parsing Technologies, IWPT'95, pp. 79\u201386. (Also available as DFKI Research Report RR-95\u201319.)"},{"volume-title":"The Core Language Engine","year":"1992","author":"Alshawi","key":"S1351324906004128_ref2"},{"key":"S1351324906004128_ref23","unstructured":"Kiefer B. and Krieger H.-U. (2002) A Context-Free Approximation of Head-Driven Phrase Structure Grammar. In: Oepen S. , Flickinger D. , Tsuji J. and Uszkoreit H. , editors, Collaborative Language Engineering. A Case Study in Efficient Grammar-based Processing, pp. 49\u201376. CSLI Publications."},{"key":"S1351324906004128_ref9","unstructured":"Carroll J. , Briscoe T. and Grover C. (1991) A Development Environment for Large Natural Language Grammars. 
Technical Report 233, Computer Laboratory, Cambridge University, UK."},{"key":"S1351324906004128_ref12","doi-asserted-by":"publisher","DOI":"10.3115\/1073012.1073031"},{"key":"S1351324906004128_ref19","first-page":"173","volume-title":"The Mental Representation of Grammatical Relations","author":"Kaplan","year":"1982"},{"volume-title":"Statistical Language Learning","year":"1993","author":"Charniak","key":"S1351324906004128_ref11"},{"volume-title":"Generalized Phrase Structure Grammar","year":"1985","author":"Gazdar","key":"S1351324906004128_ref16"},{"key":"S1351324906004128_ref30","doi-asserted-by":"crossref","unstructured":"Krieger H.-U. and Sch\u00e4fer U. (1994) TDL - A Type Description Language for Constraint-Based Grammars. Proceedings of the 15th International Conference on Computational Linguistics, COLING-94, pp. 893\u2013899. (An enlarged version of this paper is available as DFKI Research Report RR-94-37).","DOI":"10.3115\/991250.991292"},{"key":"S1351324906004128_ref47","unstructured":"Rayner M. , Hockey B. A. , James F. , Bratt E. O. , Goldwater S. and Gawron J. M. (2000) Compiling Language Models from a Linguistically Motivated Unification Grammar. Proceedings of the 18th International Conference on Computational Linguistics, COLING 2000, pp. 670\u2013676."},{"key":"S1351324906004128_ref32","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324900002382"},{"key":"S1351324906004128_ref6","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324900002369"},{"key":"S1351324906004128_ref34","unstructured":"Nakazawa T. (1995) Construction of LR Parsing Tables for Grammars Using Feature-Based Syntactic Categories. In: Cole J. , Green G. , and Morgan J. , editors, Linguistics and Computation, pp. 199\u2013219. CSLI Lecture Notes."},{"key":"S1351324906004128_ref45","doi-asserted-by":"crossref","unstructured":"Rayner M. , Dowding J. and Hockey B. A. (2001a) A Baseline Method for Compiling Typed Unification Grammars into Context Free Language Models. 
Proceedings of EUROSPEECH.","DOI":"10.21437\/Eurospeech.2001-219"},{"volume-title":"Introduction to Automata Theory, Languages, and Computation","year":"1979","author":"Hopcroft","key":"S1351324906004128_ref17"},{"key":"S1351324906004128_ref4","unstructured":"Bos J. (2002) Compilation of Unification Grammars with Compositional Semantics to Speech Recognition Packages. Proceedings of the 19th International Conference on Computational Linguistics, COLING 2002, pp. 106\u2013112."},{"key":"S1351324906004128_ref5","first-page":"25","article-title":"Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars","volume":"19","author":"Briscoe","year":"1993","journal-title":"Computational Linguistics"},{"key":"S1351324906004128_ref42","doi-asserted-by":"crossref","unstructured":"Pereira F. C. and Wright R. N. (1991) Finite-State Approximation of Phrase Structure Grammars. Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, ACL-91, pp. 246\u2013255. (An enlarged version is available in E. Roche and Y. Schabes, editors, Finite-State Devices for Natural Language Processing. Cambridge, MA: MIT Press.)","DOI":"10.3115\/981344.981376"},{"key":"S1351324906004128_ref8","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511530098"},{"key":"S1351324906004128_ref39","unstructured":"Oepen S. and Callmeier U. (2000) Measure For Measure: Parser Cross-Fertilization. Proceedings of the 6th International Workshop on Parsing Technologies, IWPT 2000, pp. 183\u2013194."},{"volume-title":"Java in a Nutshell","year":"2002","author":"Flanagan","key":"S1351324906004128_ref15"},{"key":"S1351324906004128_ref16a","unstructured":"Goldstein S. D. (1988) Using an Active Chart Parser to Convert Any Context Free Grammar to Backus-Naur Form. Master's thesis, Massachusetts Institute of Technology."},{"key":"S1351324906004128_ref22","unstructured":"Kiefer B. and Krieger H.-U. 
(2000) A Context-Free Approximation of Head-Driven Phrase Structure Grammar. Proceedings of the 6th International Workshop on Parsing Technologies, IWPT2000, pp. 135\u2013146."},{"key":"S1351324906004128_ref18","unstructured":"Hunt A. and McGlashan S. (2004) Speech Recognition Grammar Specification Version 1.0. Technical report, W3C Recommendation 16 March 2004 http:\/\/www.w3.org\/TR\/2004\/REC-speech-grammar-20040316\/."},{"key":"S1351324906004128_ref21","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1515\/9783110821895-010","volume-title":"Natural Language Processing and Speech Technology. Results of the 3rd KONVENS Conference","author":"Kasper","year":"1996"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324906004128","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,8]],"date-time":"2025-01-08T17:41:28Z","timestamp":1736358088000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324906004128\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,12]]},"references-count":53,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2007,12]]}},"alternative-id":["S1351324906004128"],"URL":"https:\/\/doi.org\/10.1017\/s1351324906004128","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2007,12]]}}}