{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,24]],"date-time":"2024-07-24T17:17:39Z","timestamp":1721841459920},"reference-count":54,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2005,2,1]],"date-time":"2005-02-01T00:00:00Z","timestamp":1107216000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Language Res Eval"],"published-print":{"date-parts":[[2005,2]]},"DOI":"10.1007\/s10579-005-2695-2","type":"journal-article","created":{"date-parts":[[2005,9,13]],"date-time":"2005-09-13T09:57:27Z","timestamp":1126605447000},"page":"45-64","source":"Crossref","is-referenced-by-count":1,"title":["Accuracy and Suitability: New Challenges for Evaluation"],"prefix":"10.1007","volume":"39","author":[{"given":"Margaret","family":"King","sequence":"first","affiliation":[]}],"member":"297","reference":[{"key":"2695_CR1","unstructured":"ALPAC. (1966) Languages and Machines: Computers in Translation and Linguistics. Report of the Automatic Language Processing Advisory Committee, Division of Behavioral Sciences, National Academy of Sciences, National Research Council Publication 1416, Washington, DC."},{"key":"2695_CR2","unstructured":"Ankherst M. (2001) Human Involvement and Interactivity of the Next Generation\u2019s Data Mining Tools. Workshop on Research Issues in Data Mining and Knowledge Discovery, Data Mining and Knowledge Discovery (DMKD) 2001."},{"key":"2695_CR3","unstructured":"AMTA. (1992) MT Evaluation: Basis for Future Directions. In Proceedings of a workshop held in San Diego, CA. Technical report, Association for Machine Translation in the Americas."},{"key":"2695_CR4","first-page":"445","volume-title":"Some Thoughts on the Reported Results of TREC Information Processing and Management, 38\/3","author":"D.C. 
Blair","year":"2002"},{"key":"2695_CR5","unstructured":"Blasband M. (1999) Practice of Validation: The ARISE Application of the Eagles Framework. In Proceedings of the European Evaluation of Language Systems Workshop. Hoevelaken, Holland."},{"key":"2695_CR6","first-page":"1325","volume-title":"Methods and Metrics for the Evaluation of Dictation Systems: a case study","author":"M. Canelli","year":"2000"},{"key":"2695_CR7","volume-title":"The Transformational Question Answering System: Description, Operating Experience and Implications","author":"F.J. Damerau","year":"1980"},{"key":"2695_CR8","unstructured":"Doyon J., Taylor K., White J.S. (1998) The DARPA MT Evaluation Methodology: Past and Present. In Proceedings of the AMTA Conference, Philadelphia, PA."},{"key":"2695_CR9","unstructured":"EAGLES. (1996) EAGLES Evaluation of Natural Language Processing Systems. Final Report, EAGLES Evaluation Working group. Report EAG-EWG-PR.2 (ISBN 87-90708-00-8), Center for Sprogteknologi, Copenhagen."},{"key":"2695_CR10","unstructured":"Falkedal K. (1994) Evaluation Methods for Machine Translation Systems: An Historical Overview and a Critical Account. Internal report, ISSCO. Available from ISSCO."},{"key":"2695_CR11","volume-title":"Towards Evaluation of NLP Systems","author":"D. Flickinger","year":"1987"},{"key":"2695_CR12","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1007\/3-540-63438-X_2","volume-title":"Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology","author":"R. Grishman","year":"1997"},{"key":"2695_CR13","doi-asserted-by":"crossref","unstructured":"Grishman R., Sundheim B. (1996) Message Understanding Conference-6: A Brief History. Coling-96.","DOI":"10.3115\/992628.992709"},{"key":"2695_CR14","first-page":"31","volume-title":"A Proposal for Task-based Evaluation of Text Summarization Systems","author":"T.F. 
Hand","year":"1997"},{"key":"2695_CR15","doi-asserted-by":"crossref","unstructured":"Hawking D., Carswell N., Thistlewaite P., Harman D. (1999) Results and Challenges in Web Search Evaluation. In Proceedings of the Eighth International Conference on World Wide Web, Elsevier.","DOI":"10.1016\/S1389-1286(99)00024-9"},{"key":"2695_CR16","volume-title":"Language Understanding Evaluations: Lessons learned from MUC and ATIS","author":"L. Hirschman","year":"1998a"},{"key":"2695_CR17","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1006\/csla.1998.0102","volume":"12","author":"L. Hirschman","year":"1998b","journal-title":"Computer Speech and Language"},{"key":"2695_CR18","first-page":"1","volume":"16","author":"E.H. Hovy","year":"2002a","journal-title":"Machine Translation"},{"key":"2695_CR19","unstructured":"Hovy E.H., King M., Popescu-Belis A. (2002b) Computer-Aided Specification of Quality Models for Machine Translation Evaluation. LREC-02, pp. 729\u2013753."},{"key":"2695_CR20","unstructured":"ISO\/IEC 9126-1. (2001) Software Engineering \u2013 Product Quality \u2013 Part 1: Quality Model. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR21","unstructured":"ISO\/IEC DTR 9126-2. (2003a) Software Engineering \u2013 Product Quality \u2013 Part 2: External Metrics. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR22","unstructured":"ISO\/IEC CD TR 9126-3. (2003b) Software Engineering \u2013 Product Quality \u2013 Part 3: Internal Metrics. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR23","unstructured":"ISO\/IEC CD 9126-4. (2004) Software Engineering \u2013 Product Quality \u2013 Part 4: Quality in use Metrics. 
Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR24","unstructured":"ISO\/IEC CD 9126-30. (in preparation) Software Engineering \u2013 Software Product Quality Requirements and Evaluation \u2013 Part 30: Quality Metrics \u2013 Metrics reference model and guide. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR25","unstructured":"ISO\/IEC 14598-1. (1999) Information Technology \u2013 Software Product Evaluation \u2013 Part 1: General Overview. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR26","unstructured":"ISO\/IEC 14598-2. (2000a) \u2013 Software Engineering \u2013 Product Evaluation \u2013 Part 2: Planning and Management. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR27","unstructured":"ISO\/IEC 14598-3. (2000b) \u2013 Software Engineering \u2013 Product Evaluation \u2013 Part 3: Process for Developers. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR28","unstructured":"ISO\/IEC 14598-4. (2000c) \u2013 Software Engineering \u2013 Product Evaluation \u2013 Part 4: Process for Acquirers. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR29","unstructured":"ISO\/IEC 14598-5. (1998) Information Technology \u2013 Software Product Evaluation \u2013 Part 5: Process for Evaluators. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR30","unstructured":"ISO\/IEC 14598-6. (2001) \u2013 Software Engineering \u2013 Product Evaluation \u2013 Part 6: Documentation of Evaluation Modules. 
Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR31","unstructured":"ISO\/IEC 9126. (1991) Information Technology \u2013 Software Product Evaluation \u2013 Quality Characteristics and Guidelines for their Use. Geneva, International Organization for Standardization and International Electrotechnical Commission."},{"key":"2695_CR32","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1109\/TSE.1985.231847","volume":"1","author":"M. Jarke","year":"1985","journal-title":"IEEE Transactions on Software Engineering, SE-11"},{"key":"2695_CR33","volume-title":"JEIDA Methodology and Criteria on Machine Translation Evaluation","author":"JEIDA.","year":"1992"},{"key":"2695_CR34","unstructured":"King M., Underwood N. (2004) User Oriented Evaluation of Knowledge Discovery Systems. In Proceedings of a Workshop at LREC-04."},{"key":"2695_CR35","doi-asserted-by":"crossref","unstructured":"Minker W. (2002) Overview on Recent Activities in Speech Understanding and Dialogue Systems Evaluation. International Conference on Speech and Language Processing (ICSLP), Denver, USA.","DOI":"10.21437\/ICSLP.2002-149"},{"key":"2695_CR36","volume-title":"The JEIDA Report on MT Evaluation. Workshop on MT Evaluation: Basis for Future Directions","author":"H. Nomura","year":"1992"},{"issue":"3","key":"2695_CR37","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1017\/S1351324998001995","volume":"4","author":"P. Paggio","year":"1998","journal-title":"Natural Language Engineering"},{"key":"2695_CR38","unstructured":"Papineni K., Roukos S., Ward T., Zhu W.-J. (2001) BLEU: A Method for Automatic Evaluation of MT. Research Report, Computer Science RC22176 (W0109-022), IBM Research Division, T.J. Watson Research Center."},{"key":"2695_CR39","unstructured":"Peters C. (2002) The Contribution of Evaluation: The CLEF Experience. 
Special Interest Group in Information Retrieval (SIGIR), 2002."},{"issue":"1","key":"2695_CR40","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1017\/S1351324901002583","volume":"7","author":"K. Sparck Jones","year":"2001","journal-title":"In Natural Language Engineering"},{"key":"2695_CR41","volume-title":"Evaluating Natural Language Processing Systems: An Analysis and Review. Lecture Notes in Artificial Intelligence 1083","author":"K. Sparck-Jones","year":"1996"},{"key":"2695_CR42","unstructured":"Spiliopoulou M., Rinaldi F., Black W.J., Zarri G.P., Mueller R.M., Brunzel M. Theodoulidis B., Orphanos G., Hess M., Dowdall J., McNaught J., King M., Persidis A., Bernard L. (2004) Coupling Information Extraction and Data Mining for Ontology Learning in Parmenides. RIAO 2004, Avignon."},{"key":"2695_CR43","unstructured":"Starlander M., Popescu-Belis A. (2002) Corpus-Based Evaluation of a French Spelling and Grammar Checker. LREC-02, Las Palmas de Gran Canaria, Spain. pp.262\u2013274."},{"key":"2695_CR44","unstructured":"TEMAA. (1996) TEMAA Final Report. Technical report LRE-62-070 (March 1996), Center for Sprogteknologi, Copenhagen, Denmark."},{"key":"2695_CR45","unstructured":"TREC. (2005) Text Retrieval Conference (TREC) TREC-9 Proceedings. Available from http:\/\/trec.nist.gov."},{"key":"2695_CR46","unstructured":"VanSlype G. (1979) Critical Study of Methods for Evaluating the Quality of MT. Technical Report BR 19142, European Commission, Directorate for General Scientific and Technical Information Management (DG XIII) Available from www.issco.unige.ch\/projects\/isle."},{"key":"2695_CR47","unstructured":"Vasilakopoulos A., Bersani M., Black B. (2004) A Suite of Tools for Marking Up Textual Data for Temporal Text Mining Scenarios. LREC-04, Lisbon."},{"key":"2695_CR48","doi-asserted-by":"crossref","unstructured":"Voorhees E.M. (2003) Evaluating the Evaluation: A Case Study Using the TREC 2002 Question Answering Track. 
HLT-NAACL.","DOI":"10.3115\/1073445.1073479"},{"key":"2695_CR49","unstructured":"Voorhees E.M. (2000) The Evaluation of Question-Answering Systems: Lessons Learned from the TREC QA Track. LREC-2000, Athens."},{"key":"2695_CR50","unstructured":"White J.S., O\u2019Connell T.A. (1994) The DARPA MT Evaluation Methodologies: Evolution, Lessons and Future Approaches. In Proceedings of the First Conference of the Association for Machine Translation in the Americas (AMTA-94). Columbia, Maryland."},{"key":"2695_CR51","unstructured":"White J.S., Taylor K.B. (1998) A Task-Oriented Evaluation Metric for Machine Translation, LREC-98."},{"key":"2695_CR52","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1145\/1499586.1499695","volume":"42","author":"W.A. Woods","year":"1973","journal-title":"AFIPS"},{"key":"2695_CR53","volume-title":"Comparing Two User-Oriented Database Query Languages: A Field Study Technical Report HPL-ISC-89-060","author":"S. Whittaker","year":"1989"},{"issue":"Suppl. 1","key":"2695_CR54","doi-asserted-by":"crossref","first-page":"i331","DOI":"10.1093\/bioinformatics\/btg1046","volume":"19","author":"A.S. 
Yeh","year":"2003","journal-title":"Bioinformatics"}],"container-title":["Language Resources and Evaluation"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10579-005-2695-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10579-005-2695-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10579-005-2695-2","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,4]],"date-time":"2023-05-04T12:43:40Z","timestamp":1683204220000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10579-005-2695-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,2]]},"references-count":54,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2005,2]]}},"alternative-id":["2695"],"URL":"https:\/\/doi.org\/10.1007\/s10579-005-2695-2","relation":{},"ISSN":["1574-020X","1572-8412"],"issn-type":[{"value":"1574-020X","type":"print"},{"value":"1572-8412","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,2]]}}}